The purpose of this project was to study the reliability of factors
of parent-infant interaction through the use of generalizability
theory. A review of the literature indicated that coefficients of
intercoder agreement were being reported as indications of reliability
for measures derived through systematic observation even though two
separate studies published 15 years ago had shown that method to be
inadequate.

Twenty-eight white, middle-class parents and their first-born
infants  (14  male,  14  female)  were  video-taped  in  a  structured  teaching 
situation  in  a  laboratory  setting  at  19,  25,  37  and  43  weeks  of  age. 
Mother  and  father  were  taped  for  three  minutes  each  while  engaging 
their  infant  in  a  structured  task  thought  to  be  appropriate  to  the  age 
of  the  infant.    Two  trained  observers  coded  the  video-tapes  using  the 
Reciprocal  Category  System.    Thirty-two  parent-infant  interaction 
measures accounting for 84 percent of the total interaction tallies
were factor analyzed and the five factors were rotated to the Varimax
criterion.    Scores  were  generated  based  on  the  resulting  five  factors  and 
the  intercoder  agreement  and  reliability  of  these  scores  were  analyzed 
using  generalizability  theory. 

Three  separate  analyses  were  done:     one  for  intercoder  agreement 
and  two  for  reliability.    One  reliability  analysis  used  a  design  where 
subjects  were  crossed  with  coders  and  the  other  used  a  design  where 
subjects  were  nested  within  coders. 

The  results  of  the  intercoder  agreement  analysis  indicated  that 
four  of  the  five  factors  showed  problems  for  two  occasions.     For  a 
third  occasion  the  intercoder  agreement  coefficients  were  above  .70 
for all five factors. The results of the two reliability studies
indicated that the generalizability coefficients for two
of the five factors were satisfactory when generalization was intended
to all parent-infant pairs and all coders, but to only one occasion.
A  third  factor  was  shown  to  be  moderately  satisfactory  under  the  same 
conditions  and  two  were  judged  as  unsatisfactory.     In  other  words, 
three  of  the  factor  scores  of  parent-infant  interaction  were  reliable 
only  for  mother  or  father  playing  with  their  infant  at  a  specific 
occasion;  two  were  not  satisfactory  even  under  those  limited  conditions. 
Observer  disagreement  and  lack  of  variance  between  subjects  were  cited 
as  problems  before  37  weeks,  and  lack  of  item  consistency  was  thought 
to  be  a  problem  for  three  of  the  factors. 

The  investigator  concluded  that  a  design  where  subjects  were  nested 
within  coders,  used  in  combination  with  a  design  where  a  portion  of  the 
subjects were crossed with coders, provided extensive and valuable
information about the reliability of factors derived through the use
of  systematic  observation.     It  was  demonstrated  that  researchers  can 
use  systematic  observation  and  report  detailed  reliability  data  without 
going  to  the  extra  expense  of  having  every  subject  coded  by  two 
observers  at  every  session.     Suggestions  for  future  research  on 
parent-infant interaction were outlined.


CHAPTER I
INTRODUCTION 

During the past 15 years we have witnessed a significant increase
in the number of studies attempting to ascertain the important
variables related to infant competence. Lewis (1967) pointed out that
this is not a new interest to social scientists, but rather the
investigation of old questions by means of more sophisticated
techniques, procedures and measures. One of the major changes has been
in types of strategies for data collection. As the problems of measures
of self-report have been reported in the literature (e.g., Yarrow, 1963),
investigators have increasingly turned to direct observation
(Clark-Stewart, 1973). As is often the case, change in one aspect of an
investigation necessitates that other changes be made as well. For
example, reliability of measures is always an important issue and,
although a large body of psychometric theory on reliability of
paper-and-pencil instruments exists, this same issue has only recently
begun to be addressed in systematic observation.

Statement  of  the  Problem 
Medley and Mitzel (1963) and Cronbach, Rajaratnam and Gleser
(1963) have proposed very similar methods of analyzing reliability of
observational measures, both of which use ANOVA intraclass correlation
coefficients as estimates of reliability. Rowley (1976) has observed
that the major advantage of the variance components approach which has
been proposed is that it "enables the researcher to pinpoint multiple
sources of error and to compute a number of different reliability
coefficients for different purposes" (p. 51). One difference between
the two methods is that Medley and Mitzel (1963) only discuss matched
data (every subject rated by all coders) whereas Cronbach, Gleser,
Nanda and Rajaratnam (1972) have extended their theory to unmatched
ratings (not all coders rate each subject). This is important since the
extra coding required to obtain matched data for an entire project is
probably one reason why most researchers are still only reporting
intercoder agreement for specific sessions as measures of reliability
rather than utilizing a more extensive analysis.

A second reason for the paucity of extensive analysis of
reliability in studies using systematic observation is that a user of an
observation instrument is not normally concerned with reliability
beyond the extent that unreliable data will obscure otherwise valid
relationships (Rowley, 1976). Rowley states that when one is speaking
of the reliability of a test, "it is usually fairly clear that the term
'reliability' refers to the scores obtained by some sample of examinees
on that test" (p. 52). The developer of a test is expected to analyze its
reliability since reliability is considered to be a property of the
instrument itself. Therefore, it is assumed that all investigators
will get the same measures if they use a reliable instrument on a
sample of the same population.

Rowley (1976) goes on to observe that in the context of
systematic observation, "it has frequently been asserted that
reliability is a desirable property; it has not always been clear
just what it is that is supposed to possess this attribute." He,
therefore, makes the following distinctions:

Observation  instrument  -  a  set  of  procedures  whereby  an  observer 
can  record  and  categorize  the  behavior  of  a  subject  or  subjects. 
It normally consists of a number of items, to which the observer
responds  in  some  way  dependent  on  the  behavior  he  has  observed. 

Observation  record  -  a  set  of  data  (usually  in  the  form  of  symbols) 
which  describes  the  behavior  of  one  or  more  subjects  during  one 
or  more  periods  of  observation. 

Observation  measures  -  a  procedure  for  using  an  observation  record 
to  assign  scores  to  each  of  the  subjects  of  observation;  each 
score  so  assigned  is  assumed  to  reflect  some  characteristic  of 
that subject. (p. 52)

Since single measures are very seldom used by themselves, I would
make the following additional distinction:

Observation  composite  or  factor  -  a  procedure  for  combining  measure 
scores  to  assign  composite  or  factor  scores  to  each  of  the  subjects; 
it  is  also  assumed  that  these  scores  reflect  some  characteristic 
of  the  behavior  of  that  subject. 

Reliability, then, is a property of a measure, composite or
factor score and not of an instrument or record (Rowley, 1976). It
is possible that a single observation instrument could produce scores
which are reliable and also scores that are unreliable. For example,
if one wanted to investigate the relationship of parent verbal
behavior to infant competence, one could decide to count specific
behaviors such as parent expresses positive affect, asks a question,
gives instructions, responds to child's request, corrects child's
behavior, etc. The instructions to the coder on what to look for and
how to code it (i.e., every time the behavior occurs, once per time
period, etc.) would constitute the instrument. When the coder actually
observed the parent-infant interaction and wrote something on a code
sheet, an observation record would be produced. From that record
several different measures, composites or factors could be produced
by counting the frequencies of items individually, by adding certain
items together based upon previous literature, by factor analysis,
etc. Each measure, composite or factor score is a separate entity
and could prove to be either reliable or unreliable.

Medley  and  Mitzel  (1963)  have  defined  reliability  of  a  measure 
as  follows: 

a  measure  is  reliable  to  the  extent  that  the  average 
difference  between  two  measurements  independently  obtained 
in  the  same  classroom  is  smaller  than  the  average  difference 
between two measures obtained in different classrooms. (p. 250)

By this is meant that if observer A were to code a teacher's classroom
on several occasions and observer B were to code the same classroom,
though on different occasions, the difference between their average
measures would be smaller than if observer B had coded a different
classroom.
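
The comparison in this definition can be illustrated with a minimal sketch. The classroom names and scores below are hypothetical, and the use of mean absolute differences is an assumption made for illustration; it is not Medley and Mitzel's own computational procedure.

```python
from itertools import combinations
from statistics import mean

def mean_abs_diff_within(scores_by_class):
    """Average absolute difference between two measurements of the same classroom."""
    diffs = [abs(a - b)
             for scores in scores_by_class.values()
             for a, b in combinations(scores, 2)]
    return mean(diffs)

def mean_abs_diff_between(scores_by_class):
    """Average absolute difference between measurements of different classrooms."""
    diffs = [abs(a - b)
             for s1, s2 in combinations(scores_by_class.values(), 2)
             for a in s1 for b in s2]
    return mean(diffs)

# Hypothetical data: each classroom measured independently by observers A and B.
scores = {
    "classroom_1": [12.0, 13.0],
    "classroom_2": [20.0, 18.0],
    "classroom_3": [7.0, 9.0],
}

within = mean_abs_diff_within(scores)
between = mean_abs_diff_between(scores)
print("within:", within, "between:", between)
```

In the sense of the definition, the measure behaves reliably in this toy data set when `within` is clearly smaller than `between`.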

Using  this  definition  of  reliability,  unreliability  can  come 
about  in  two  ways: 

1)  Two  measures  of  the  same  subject  (or  classroom,  parent,  etc.) 
tend  to  differ  too  much  because: 

a)  Behavior  of  the  subject  is  unstable; 

b)  Observers  are  unable  to  agree  on  what  occurs; 

c)  Different  items  which  enter  into  the  measurement  lack 
consistency,  etc. 

2)  Differences between subjects are too small.

Both item consistency and inter-rater agreement are important
issues  to  be  addressed,  but  they  are  not  the  most  important  sources 
of  error  variance.  The  most  important  source  of  error  variance  is 
the  variability  of  the  behavior  of  the  subject  of  observation 
(McGaw,  Wardrop  and  Bunda,  1972),  which  must  be  stable  for  a  given 
subject  and  must  vary  among  the  subjects  if  reliable  scores  are  to 
be  obtained. 

Generalizability  Theory 
As stated previously, Medley and Mitzel (1963) and Cronbach et al.
(1963) have published detailed examples of the use of ANOVA as a
technique for investigating reliability of systematic observation
measures. Cronbach et al. (1963) have labeled their theory
generalizability theory, which is

based  on  the  premise  that  any  rating  of  an  individual  is  only 
one  of  a  population  of  ratings  that  might  be  made  for  that 
individual.    The  ideal  datum  would  be  the  average  of  the 
population  of  the  ratings,  the  universe  score.    The  question 
of  reliability  is  posed  as  the  question  of  the  generalizability 
of  the  obtained  rating  to  the  universe  score.    One  method  of 
indexing  the  accuracy  of  the  generalization  is  a  generalizability 
coefficient,  the  ratio  of  the  universe  score  variance  to  observed- 
score  variance.     (Algina,  1978,  p.  135) 

Cronbach et al. (1972) have distinguished between a
generalizability study (G study), which is done for the purpose of
investigating the relation between observed score and the universe
score, and a decision study (D study), from which decisions are made as
to the relationship of the score under study to other measures. Stated
another way, the purpose of a G study is to investigate the reliability
of specific scores which have been derived from an observation record.
Each specific score would only be a sample of all universe scores that
might be obtained from a specific observation record. A D study could
then use the scores investigated in the G study to investigate those
scores' relationship with certain outcome or product variables.
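
As an illustration of what a G study estimates, the following is a minimal sketch for a design in which every subject is rated once by every rater (subjects crossed with raters). The function names, the dictionary keys and the ratings are hypothetical, the estimators are the usual expected-mean-square solutions for a two-way random-effects ANOVA, and dividing the residual by the number of raters is an added assumption about how a later D study might average over raters.

```python
from statistics import mean

def g_study_crossed(ratings):
    """Estimate variance components for a subjects x raters crossed design
    with one rating per cell (rows are subjects, columns are raters)."""
    n_p, n_r = len(ratings), len(ratings[0])
    grand = mean(x for row in ratings for x in row)
    subj_means = [mean(row) for row in ratings]
    rater_means = [mean(row[j] for row in ratings) for j in range(n_r)]

    # Mean squares for subjects, raters, and the residual (interaction + error).
    ms_p = n_r * sum((m - grand) ** 2 for m in subj_means) / (n_p - 1)
    ms_r = n_p * sum((m - grand) ** 2 for m in rater_means) / (n_r - 1)
    ss_res = sum((ratings[i][j] - subj_means[i] - rater_means[j] + grand) ** 2
                 for i in range(n_p) for j in range(n_r))
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Negative estimates are truncated at zero, a common convention.
    return {
        "subjects": max((ms_p - ms_res) / n_r, 0.0),
        "raters": max((ms_r - ms_res) / n_p, 0.0),
        "residual": ms_res,
    }

def g_coefficient(components, n_raters=1):
    """Ratio of universe-score variance to expected observed-score variance
    when each score is the average over n_raters raters."""
    universe = components["subjects"]
    return universe / (universe + components["residual"] / n_raters)

# Hypothetical ratings: four subjects, two raters.
ratings = [[4.0, 5.0], [7.0, 6.0], [2.0, 3.0], [9.0, 9.0]]
components = g_study_crossed(ratings)
print(components)
print(g_coefficient(components, n_raters=1), g_coefficient(components, n_raters=2))
```

Because the rater component does not enter this coefficient, it describes how well subjects are ordered relative to one another, not how close each rating is to the universe score in absolute terms.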

Systematic Observation of Parent-Infant Interaction

Investigators using systematic observation to study
parent-infant interaction have not proven to be an exception with respect to
reporting intercoder agreement coefficients as estimates of reliability.
In a review of 50 observational studies (Lytton, 1971), there were 13
applicable to infancy, none of which reported more than intercoder
agreement. In a later compilation of 73 instruments used in child
development (Boyer, Simon and Karafin, 1973), 10 were applicable to
infancy and none of these reported more than intercoder agreement.
From several major publications in the area of developmental psychology
from 1973 to the present (i.e., Child Development, Developmental
Psychology, Monographs of the Society for Research in Child Development), 13
studies of parent-infant interaction were noted and again, none of
these reported more than intercoder agreement.

Gordon (1974) investigated mother-infant interaction using three
different observation instruments developed in three separate
projects (Escalona and Gorman, 1974; Gordon and Jester, 1972; Watts
and Barnett, 1971). Gordon's study was a direct response to earlier
appeals for cross-validation of findings derived from the use of
home-grown instruments (Ad Hoc Committee on Child Mental Health, 1971;
Sparling and Gallagher, 1971). These same three instruments are
presently being used in a longitudinal study of mother- and
father-infant interaction in an attempt to extend previously found
relationships between parent-infant interaction and infant competence
(Gordon and Soar, in progress). However, intercoder agreement is the
most that has ever been reported as an estimate of reliability for any
scores derived from these instruments.

As important as it is to identify measures, composites or factors
of parent-infant interaction through repeated use of observation
instruments, it is even more important to investigate repeatedly
the reliability of those scores since without this information one
does not know if nonsignificant prediction of infant competence is a
function of no real relationship or the result of unreliable data.
In addition, the extent to which the measures, composites or factors
are generalizable should tell us something about parent-infant
interaction separate and apart from whether or not those scores predict
infant competence.

Purpose  of  Study 
The purpose of this study, then, is to investigate the
reliability of parent-infant interaction through the use of
generalizability theory. Important variables which have been found to relate
to either parent or infant behavior will be used to define the
different facets of the study. Several different designs will be
used in an attempt to determine the feasibility of eliminating the
requirement to use matched data when investigating reliability of
data which are to be used in a decision study. Since this study
may be some readers' first introduction to generalizability theory,
a Glossary is provided. It is expected that the results of this
study will contribute to the theory of parent-infant interaction as
well as the design of future decision studies relating parent-infant
interaction to infant competence.


CHAPTER  II 
REVIEW  OF  THE  LITERATURE 

There are three major areas of research to be reviewed. First is
the literature on parent-infant interaction; second is the literature
on systematic observation; and third is the literature on
generalizability theory.

Parent-Infant  Interaction 

Most of the research that has been done on parent-infant
interaction during the first year of life has been limited to the study of
the mother-infant dyad. The exclusive study of this dyad assumes that
it is unique (Ainsworth, 1973; Bowlby, 1951; Stern, 1974) and the
forerunner of later social relationships (Kogan, Wimberger and Bobbitt, 1969);
the father-infant relationship has been severely neglected (Lamb, 1975).
However, recent research has indicated that the father-infant
relationship is established early and contributes significantly to the
development of the child.

In one recent study, Lamb (1977) observed 20 infants (10 boys,
10 girls) interacting with their mothers and fathers at home when they
were 7, 8, 12, and 13 months of age. The families were described as a
"representative sample of young, intact, and stable lower- to
upper-middle-class families within which parental and marital roles
were traditionally allocated" (p. 169).

The major variables of interest were affiliative behaviors
(smiling, vocalizing, looking, laughing and preferring) and attachment
behaviors (proximity, touching, approaching, seeking to be held,
fussing and reaching). Lamb found that infants showed no clear
preference for either parent in the display of attachment behaviors, even
though both parents were consistently differentiated from a relatively
unfamiliar investigator on these measures. This was taken as an
indication that infants were clearly attached to both parents at a very
early age. Preferences in the display of affiliative behaviors were
explained largely by differences in the degree of adult activity in
interaction with the infants. As the infants grew older, they were
increasingly likely to direct affiliative behaviors to both the parents
and the investigator.

Additional evidence that father-infant interaction should be
studied is provided by a review of four different experimental studies
of infant attachment to both mother and father (Kotelchuck, 1976). He
reported that even though the exact number of children who prefer
mothers or fathers varies, depending on the measure chosen,
approximately 55% of the 12- to 21-month-old children showed maternal preferences,
20% showed joint preferences, and 25% showed paternal preferences. The
data from these studies were gathered in the laboratory, in the home
and cross-culturally. In view of this research, Bowlby's (1969)
conclusion that the infant is monotropically (relates to one person)
matricentric (relates to mother) seems to need some modification.

Additional support for including the father in studies of
parent-infant interaction comes from the fact that sex of parent
is one of the most consistently reported variables that has been found
to influence patterns of interaction. In a study of 24 4- and
8-month-old infants in 11 Israeli kibbutzim, Gewirtz and Gewirtz (1968)
found that the infant sees his/her mother for at least twice as much
time as he/she does the father. However, when the type of situation was
divided into "caretaking" and "social," it was found that the
differences were accounted for by the fact that fathers spent very
little time in caretaking activities.

Rebelsky and Hanks (1971), in a study of 10 2-week to 3-month-old
infants (7 male, 3 female) born into white, lower-middle- to
upper-middle-class families, found that fathers talk infrequently and for
short periods of time to their infants during the first three months
of life. Also, fathers tend to vocalize differently with their male
and female infants, in a pattern which, when compared to similar data
for mothers (Moss, 1967), appears to be just the opposite of the
mother-infant pattern. Rebelsky and Hanks's data indicate that at 2 weeks
and 4 weeks of age fathers of female infants tend to verbalize more
while Moss's data show that mothers of male infants verbalize more
at 3 weeks. By the time the infants are three months of age, these
patterns are reversed. Fathers of male infants vocalize somewhat more
whereas mothers of female infants tend to vocalize more.

Additional evidence that mothers and fathers differ in their
interactions with infants is provided by several additional studies. An
informal study by Biller (1974) as reported in Lamb (1976c) suggests
that whereas mothers were more likely to inhibit a child's exploration,
fathers encouraged their infant's curiosity and urged them to attempt
to solve cognitive and motoric challenges.
Lamb (1976d) found that mothers held their infants most often to
engage in caretaking functions, while fathers held them most often to
play. Kotelchuck (1976) also found that fathers spent a larger
percentage of their time in playful behavior with their infants than did
mothers.

The major variables that have been found to correlate consistently
with infant behavior are age and sex of the infant. Kagan (1971), in
a study of 180 white, firstborn infants, found discontinuities in
behavior when assessing infants on a wide variety of measures at 4, 8,
13, and 27 months. Gordon and Jester (1972) found very similar
discontinuities with a sample of black infants born into low-income
families. Emde, Gaensbauer and Harmon (1976) found that changes in
unexplained fussiness, wakefulness, and the emergence of infant social
behaviors combined to distinguish three separate periods during the
infant's first year of life: 1) birth to the end of the second month,
2) third month through the sixth month, and 3) seventh month through
the twelfth month. Bell and Harper (1977), in a review of the
literature of discontinuity in infancy, concurred with these findings, adding
that the increased ability of the infant to react to the stimulation
of the environment is also a basis for change.

These discontinuities in behavior have been replicated in
cross-cultural research by Lusk and Lewis (1972), who studied 10 Wolof infants
in Senegal. That these discontinuities result in altered mother-infant
behavior patterns was shown by Crawley, Rodgers, Freidman, Iacobbo,
Criticos, Richardson and Thompson (1978), who studied 48 4-, 6-, and
8-month-old infants and their mothers in a laboratory free-play situation.
Mothers of 4-month-olds typically played games conducive to the direct
stimulation of infant attentional and positive affect responses. In
contrast, mothers of 8-month-olds incorporated games possessing a
conventional motoric role that could readily be assumed by the infant.

With regard to infant sex, Clark-Stewart (1973), in a longitudinal
study of 36 low-income mothers and their first-born, normal infants
(9-18 months old), found that boys became increasingly more object
oriented with age whereas girls became increasingly more socially
oriented. Goldberg and Lewis (1969) observed 64 13-month-old infants
(32 male, 32 female) with their mothers in a standardized laboratory
free-play situation and found striking sex differences. Boys were more
independent, showed more exploratory behavior, played with toys
requiring gross motor activity, were more vigorous, and tended to run
and bang in their play. Girls were more dependent, showed less
exploratory behavior, and their play reflected a more quiet style. Moss (1967)
also found that at age three months boys slept less and cried more than
girls.

Type of situation is another variable which has been found to
influence parent-infant interaction. Rebelsky and Hanks (1971) found
that fathers decreased their verbalizations with their infants during
caretaking activities. Lamb (1976a) showed that infants' affiliative
behaviors towards their parents changed when a stranger entered the
room. Clark-Stewart (1973) found that parent-infant interaction
patterns were different in free-play and structured situations. However,
Peterson (1975), in a study of 20 white, middle- and working-class
mothers and their 12- to 16-month-old infants, found many categories
of interactive behavior to be similar between the home and a laboratory
setting.

Several parent behaviors which have been found to contribute to
infant competence are stimulation, responsiveness, acceptance of child's
behavior and appropriateness of parents' behavior for age and ability
of the child (Bell and Ainsworth, 1972; Bromwich, 1976; Clark-Stewart,
1973; Gordon, 1974; Kagan, 1971; Lewis and Goldberg, 1969). Bakow,
Sameroff, Kelly and Zax (1973) and Clark-Stewart (1973) have shown that
these parent behaviors tend to cluster together; parents scoring high
on one generally score high on all. Ability to sustain interaction
sequences (Peterson, 1975; Watson, 1972), amount of language and
positive feedback produced (Peterson, 1975) and the ability to provide a
warm, nurturant atmosphere (Lamb, 1976b; Walters and Stinnett, 1971)
have also been found to relate positively to infant competence.

Systematic  Observation 

In order to study parent-infant interaction properly one must
observe the parent and infant together and record the behavior in such a
manner that the sequences of behavior are preserved for later analysis
(Gewirtz, 1969). However, the vast majority of researchers studying
parent-infant interaction have used time-sampling observation systems
in which the sequencing of events is not recorded. In a review of 50
observational studies (Lytton, 1971), there were 13 applicable to
infancy and only one study which preserved the sequences for later analysis
(Brody, 1956). In a later compilation of 73 observation instruments
used in child development (Boyer, Simon, and Karafin, 1973), 10 were
applicable to infancy. Of those, only three preserved the
parent-infant sequences for later analysis (the systems of Ainsworth, Salter,
Bell, and Stayton, 1972; Caldwell and Honig, 1970; Gordon and Jester, 1972).

The Ainsworth, Salter, Bell, and Stayton (1972) system was intended
for use in coding infant attachment and reciprocal maternal behaviors
during naturalistic observation in the familiar home environment.
The Caldwell and Honig (1970) system was also designed to be used in
natural settings and provides data on behavior emitted and received
by children in relation to their peers, their caretakers, their teachers
and the environment. Both of the above named systems code both verbal
and nonverbal behavior. The Gordon and Jester (1972) system was designed
to code the interaction of a mother, an infant, and a parent educator
in a structured teaching situation. In the Gordon and Jester system
adult behavior must be verbal although baby behavior may be either verbal
or nonverbal.

Generalizability  Theory 
The organization of this section was influenced by Llabre (1978).
Classical psychometric theory is primarily based on a model proposed
by Spearman (1904) which states that a person's observed score (X) is
the sum of two components, one being the true score (T) and the second
being an undifferentiated error component (E), as shown below:

$$X = T + E.$$

Since these two components are assumed to be independent of each
other, the variance for a group of individuals can be partitioned into
the sum of independent variance components:

$$\sigma_X^2 = \sigma_T^2 + \sigma_E^2.$$

General agreement has been reached that the form of the definition
of reliability should be a ratio of true-score variance to
observed-score variance,

$$\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2},$$

but there is still some disagreement as to how $\sigma_T^2$ and
$\sigma_X^2$ should be computed (Medley and Mitzel, 1963).
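
The classical ratio can be checked numerically with a small simulation. This is only a sketch: the sample size, means and standard deviations are hypothetical, and true scores and errors are generated independently so that the classical assumption holds by construction.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

random.seed(0)
N = 20000  # hypothetical number of persons

true_scores = [random.gauss(50.0, 10.0) for _ in range(N)]  # T
errors = [random.gauss(0.0, 5.0) for _ in range(N)]          # E, independent of T
observed = [t + e for t, e in zip(true_scores, errors)]      # X = T + E

var_t = variance(true_scores)
var_e = variance(errors)
var_x = variance(observed)

# Reliability as the ratio of true-score variance to observed-score variance.
reliability = var_t / var_x
print(round(var_x, 2), round(var_t + var_e, 2), round(reliability, 3))
```

With these settings the observed-score variance should come out close to the sum of the two component variances, and the ratio close to 100 / 125 = 0.8. In practice the true scores are not observable, which is why the question of how to estimate the two variances remains open.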

With respect to the use of generalizability theory for the purpose
of estimating the reliability of scores obtained through systematic
observation, the underlying linear model is

$$X_{pr} = \mu + (\mu_p - \mu) + (\mu_r - \mu) + e_{pr},$$

where $X_{pr}$ is the rating of the pth subject by the rth rater (Algina,
1978). Additionally,

Both raters and subjects are assumed to be random samples from
infinite populations of raters and subjects. Using $\mathcal{E}(\,)$ to
represent the expectation operator, the various $\mu$s are defined as

$$\mu_p = \mathcal{E}_r X_{pr}, \qquad \mu_r = \mathcal{E}_p X_{pr}, \qquad \mu = \mathcal{E}_p \mathcal{E}_r X_{pr}.$$

The mean $\mu_p$ as the expected value of all ratings of the pth
subject is the quantity we would like to obtain for this subject and
is referred to as the universe score. The residual $e_{pr}$ is
composed of two confounded components,

$$e_{pr} = \alpha_{pr} + E_{pr},$$

where $\alpha_{pr}$ is the interaction between the rth rater and the pth
subject, and $E_{pr}$ is an error random variable with mean zero and
variance $\sigma_{pr}^2$. (Algina, 1978, p. 136)
The variance $\sigma_p^2$ is referred to as the universe score variance
(or true score variance) and is equal to

$$\sigma_p^2 = \varepsilon_p(\mu_p - \mu)^2.$$

The variance for raters is

$$\sigma_r^2 = \varepsilon_r(\mu_r - \mu)^2,$$

and the error variance is

$$\sigma_e^2 = \varepsilon_p \varepsilon_r(\alpha_{pr}^2 + \sigma_{pr}^2).$$

In the case of generalizability theory, then, the error variance
($\sigma_E^2$) is composed of variance for raters ($\sigma_r^2$) and error variance ($\sigma_e^2$):

$$\sigma_E^2 = \sigma_r^2 + \sigma_e^2.$$

There are two separate definitions of observed-score variance
derivable from the above definitions and equations. The first is

$$\sigma_{X|r}^2 = \varepsilon_p(X_{pr} - \mu_r)^2 = \sigma_p^2 + \varepsilon_p(\alpha_{pr}^2 + \sigma_{pr}^2)$$

and the second is

$$\sigma_X^2 = \varepsilon_p \varepsilon_r(X_{pr} - \mu)^2 = \sigma_p^2 + \sigma_r^2 + \sigma_e^2.$$

Since the variance for raters ($\sigma_r^2$) is very difficult to compute,
Cronbach et al. (1963, 1972) defined a generalizability coefficient
($\rho_p^2$) as follows:

$$\rho_p^2 = \sigma_p^2 / \varepsilon_r \sigma_{X|r}^2 = \sigma_p^2 / (\sigma_p^2 + \sigma_e^2),$$

which Rajaratnam, Cronbach and Gleser (1965) showed to be a lower bound
for $\varepsilon \rho_r^2$.
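
A minimal sketch of how such a coefficient might be estimated from a fully crossed subjects-by-raters table with one rating per cell is given below. The usual two-way ANOVA mean squares are used to estimate the variance components; the function name, the data layout, and the example ratings are assumptions made for illustration and are not taken from this study.

```python
def g_coefficient(ratings):
    """Estimate variance components and rho^2 = var_p / (var_p + var_e).

    ratings[p][r] is the rating of subject p by rater r (crossed design,
    one rating per cell). Negative estimates are set to zero.
    """
    n_p = len(ratings)
    n_r = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n_p * n_r)
    subject_means = [sum(row) / n_r for row in ratings]
    rater_means = [sum(ratings[p][r] for p in range(n_p)) / n_p
                   for r in range(n_r)]

    ss_p = n_r * sum((m - grand) ** 2 for m in subject_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_res = ss_total - ss_p - ss_r  # interaction confounded with error

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    var_e = max(ms_res, 0.0)
    var_p = max((ms_p - ms_res) / n_r, 0.0)
    var_r = max((ms_r - ms_res) / n_p, 0.0)
    total = var_p + var_e
    rho2 = var_p / total if total > 0 else 0.0
    return {"var_p": var_p, "var_r": var_r, "var_e": var_e, "rho2": rho2}


if __name__ == "__main__":
    # Hypothetical ratings: rater 2 is always one point higher than rater 1.
    example = [[4, 5], [6, 7], [8, 9], [10, 11]]
    print(g_coefficient(example))
```

In the hypothetical example the raters differ by a constant, so the rater component is nonzero while the residual is zero; because the rater component does not enter the denominator of this coefficient, the estimate equals 1.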

As observed by Llabre (1978), Brennan (1975) extended the rationale
of estimating reliability through the use of generalizability theory
to a split-plot factorial design where students were nested within
classes. He compared the generalizability coefficients derived from
the split-plot design to coefficients derived when the nesting
classification was ignored (i.e., a randomized block design) and found that

if one uses a randomized block design to calculate reliability
for persons when, in fact, persons are nested within some
dimension, such as schools or classrooms, the resulting coefficient
will be biased, and, moreover, the direction of bias will be
unknown. (Brennan, 1975, p. 785)

As also noted by Llabre (1978), Kane and Brennan (1977) extended
the use of the split-plot design to a mixed model (i.e., a
model having both random and fixed facets). They showed that different
coefficients could be generated depending upon the definition
of universe score and, hence, a different definition of error. In
each case, however, the definition of observed score variance was identical.
This observed score variance was simply the sum of the variances of the
separate effects, eliminating any effects that were common to all classes or
subjects.

Summary 

In  the  review  of  the  literature  for  parent-infant  interaction  it 
was  shown  that  there  are  four  facets  which  have  been  found  to  influence 
parent-infant interaction. They are sex of parent, sex of infant, age
of infant, and type of situation or task. Additionally, the facet of
coder has been shown to be an important consideration when using
systematic observation.

The review of literature for systematic observation has shown that
even though the coding of sequences of behavior is necessary in order
to properly study parent-infant interaction, as of the last major
review (Boyer, Simon and Karafin, 1973) only three systems had done so
(the systems of Ainsworth, Salter, Bell and Stayton, 1972; Caldwell and Honig,
1970; Gordon and Jester, 1972). Additionally, none of these
investigators have reported more than intercoder agreement as measures of
reliability.

The review of the literature for generalizability theory has shown
that the classical reliability coefficient, defined as the ratio of true
score variance ($\sigma_T^2$) to observed score variance ($\sigma_X^2$),

$$\rho_{XX} = \sigma_T^2 / \sigma_X^2,$$

is defined in generalizability theory as the ratio of universe score
variance ($\sigma_p^2$) to observed score variance ($\sigma_X^2$), and can be estimated
through computation of a generalizability coefficient

$$\rho_p^2 = \sigma_p^2 / (\sigma_p^2 + \sigma_e^2)$$

where $\rho_p^2$ is the generalizability coefficient for the rating of a
subject, $\sigma_p^2$ is the variance component for the subjects, and $\sigma_e^2$ is the
variance component for the experimental error associated with that
rating. Additionally, it was stated that the use of ANOVA is the
accepted method of computing estimates of variance components.


CHAPTER  III 
DESIGN 

The  purpose  of  this  study  was  to  investigate  the  reliability  of 
parent-infant  interaction  through  the  use  of  generalizability  theory. 
Video-tapes  of  parents  interacting  with  their  infants  in  a  structured 
teaching  situation  in  a  laboratory  setting  were  made  on  four  separate 
occasions.    The  sample,  specific  objectives,  and  data  gathering  and 
analysis  procedures  are  described  in  this  chapter. 

Sample 

The sample consisted of 28 white, first-born infants falling within
the normal physical range as determined by physical examinations at
age  three  months.     There  were  14  male  and  14  female  infants.  This 
sample  was  recruited  for  a  larger,  longitudinal  project  (Gordon  and 
Soar,  in  progress)  via  local  radio,  community  and  campus  newspapers, 
commercial and public television and local pediatricians. Socioeconomic
status was determined by using the Two Factor Index of Social
Position  (Hollingshead,  1957).    The  numbers  of  families  in  each  of  the 
classifications  are  shown  in  Table  1. 

Procedure 

Each of the 28 families was video-taped for seven observations
scheduled six weeks apart beginning at 13 weeks and terminating at 49
weeks. Only the data from the 19 week, 25 week, 37 week and 43 week
visits were used in this analysis. At each visit the parents were
presented with a specific task taken from Gordon's (1970) Baby Learning
Through Baby Play, thought to be appropriate for the age of the child
(see Appendix). All families at each age were presented with the same
activity. The parents were asked to read the instructions and to then
use the ideas in interacting with the baby.

Table 1

Number of Families for Each Classification
of Hollingshead's Index of Social Position

Classification                               Number of Families
I    Major Professionals and Executives               6
II   Lesser Professionals and Managers                9
III  Clerical, Sales and Technical                   12
IV   Skilled Manual Employees                         1

Each  session  consisted  of  a  total  of  nine  minutes:    three  minutes 
were  mother-infant,  three  minutes  were  father-infant,  and  three  minutes 
were both parents-infant. The order of parents interacting with their
infants  was  randomly  assigned.    Only  the  mother-infant  and  father- 
infant  data  were  used  in  this  study. 

The  video-taping  took  place  in  a  studio  on  the  University  of 
Florida campus that had been temporarily set up for this purpose. It
consisted  of  a  12'  x  12'  enclosure  with  small  holes  cut  in  several 
panels,  three  black-and-white  cameras,  monitors,  camera  mixer,  and  two 
1/2"  reel-to-reel  Sony  recorders. 

The Bayley Mental Development Quotient (Bayley, 1969) was administered
at age 12 months.

Specific  Objectives 

The  specific  objectives  of  this  study  were  to  determine  the  extent 
to  which  factor  scores  of  parent-infant  interaction  are  generalizable 
across different levels of the facets which have been proposed as
affecting those scores.

Objective  1:    To  determine  the  intercoder  agreement  for  factor 
scores  of  parent-infant  interaction  at  particular  occasions. 

Objective 2: To determine the reliability of factor scores of
parent-infant interaction when subjects are crossed with coders (i.e.,
all subjects are observed by both coders).

Objective 3: To determine the reliability of factor scores of
parent-infant interaction when subjects are nested within coders (i.e.,
each coder observes only one group of subjects).

Method  of  Analysis 

Observer  Training 

Two  observers  were  trained  to  code  the  Reciprocal  Category  System 
(RCS)  until  the  measures  of  a  single  family  on  each  of  the  separate 
items  in  the  observation  instrument  were  within  two  tallies  of  each 
other. Parent-infant interaction measures for the RCS are normally
tallied  in  such  a  manner  that  each  entry  in  the  record  is  paired  with 
the  entry  immediately  following  it.     This  produces  a  28  x  28  matrix, 
of  which  most  entries  are  zero.     Since  at  the  time  of  the  training  of 
the  coders  it  was  not  known  which  interaction  measures  would  be  used  to 
form  factor  scores  nor  in  what  manner  the  factor  scores  would  be  formed, 
it  simply  was  not  feasible  to  work  with  anything  more  than  the  separate 
items.    The  check  on  the  training  procedure  was  simply  an  attempt  to 
produce  an  accurate  observation  record  from  which  observation  measures 
could be produced. Periodic checking was done to ensure that the coders
did not "drift" from their original agreements.
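
The tallying rule described above, in which each entry in the record is paired with the entry immediately following it, can be sketched as follows. The list of 28 codes follows the two-digit scheme of the RCS (baby 01-10, mother 11-19, father 21-29); the function names and the example record are hypothetical and serve only to illustrate the pairing.

```python
# The 28 RCS codes: baby 1-10, mother 11-19, father 21-29.
RCS_CODES = list(range(1, 11)) + list(range(11, 20)) + list(range(21, 30))
INDEX = {code: position for position, code in enumerate(RCS_CODES)}


def transition_matrix(record):
    """Tally each code with the code immediately following it.

    Rows index the first entry of a pair, columns the following entry,
    giving a 28 x 28 matrix in which most cells stay at zero.
    """
    size = len(RCS_CODES)
    matrix = [[0] * size for _ in range(size)]
    for first, following in zip(record, record[1:]):
        matrix[INDEX[first]][INDEX[following]] += 1
    return matrix


def cell(matrix, row_code, col_code):
    """Return the tally for the pair (row_code followed by col_code)."""
    return matrix[INDEX[row_code]][INDEX[col_code]]


if __name__ == "__main__":
    # Hypothetical record: mother elicits (14), baby responds (05),
    # mother accepts (12), repeated once.
    record = [14, 5, 12, 14, 5, 12]
    matrix = transition_matrix(record)
    print(cell(matrix, 14, 5))            # tallies for "mother elicits; baby responds"
    print(sum(sum(row) for row in matrix))  # one tally per adjacent pair
```

A record of n codes yields n - 1 tallies, which is why a three-minute segment coded at least once every three seconds fills only a small share of the 784 cells.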

RCS Measures

The  Reciprocal  Category  System  (RCS)  which  is  being  used  in  the 
present  study  is  a  slight  modification  of  that  used  by  Gordon  and  Jester 
(1972). It traces its history back through Ober, Wood and Roberts
(1968) and Flanders (1965) to Bales' Interaction Process Analysis (1951).
The RCS consists of 28 categories, the first digit of which signifies the
actor (infant, mother, father) and the second digit signifies the
behavior (see Table 2). Adult behavior must be verbal in nature, but
infant  behavior  may  be  either  physical  or  verbal.     Behavior  is  coded 
as  it  occurs  with  a  minimum  of  one  code  every  three  seconds. 

From an analysis of data gathered in a previous project (Gordon,
1974) and a preliminary review of some of the data, 32 measures
accounting for 84% of the total tallies were identified for use in
further analysis (see Table 3). It was assumed that the raw data
would not meet the restrictive assumptions of the statistical procedures
to be used in later analyses. Therefore, the data were area
transformed (making the distribution as normal as possible) and t-scored
measures  were  produced  having  a  mean  of  50  and  a  standard  deviation  of 
approximately  10.    The  measures  were  then  factor  analyzed  and  rotated 
to  the  Varimax  criterion  using  a  factor  analysis  program  from  the 
University  of  Florida  Educational  Evaluation  Library  as  modified  by 
L.  B.  Stebbins.     Incomplete  factor  scores  were  produced  for  each 
rotated  factor  by  weighting  each  variable  either  one  or  zero  with  a 
cutoff  of  .3999  (see  Horn,  1965).    Any  measure  which  loaded  above  the 
cutoff  on  two  factors  was  loaded  on  both. 
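
The two steps described above can be sketched roughly as follows: an area (rank-based) transformation to T scores with mean 50 and standard deviation near 10, and unit weighting of the rotated loadings at the .3999 cutoff. This is an illustrative sketch only. The percentile rule (rank - 0.5)/n, the use of the absolute value of the loading, and all function names are assumptions, not the procedure of the program actually used.

```python
from statistics import NormalDist


def area_transform_t_scores(values):
    """Ranks -> percentiles -> normal deviates -> T scores (50 + 10z).

    Tied values share the mean of their ranks. The percentile rule
    (rank - 0.5) / n is an assumption made for this sketch.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    start = 0
    while start < n:
        end = start
        while end + 1 < n and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # mean of the 1-based ranks in the tie
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    normal = NormalDist()
    return [50 + 10 * normal.inv_cdf((rank - 0.5) / n) for rank in ranks]


def unit_weights(loadings, cutoff=0.3999):
    """loadings[m][f] -> 1 if the loading exceeds the cutoff, else 0.

    Using the absolute value is an assumption; a measure above the
    cutoff on two factors receives a weight of 1 on both.
    """
    return [[1 if abs(value) > cutoff else 0 for value in row]
            for row in loadings]


def incomplete_factor_scores(t_scores, weights):
    """t_scores[m][s]: T score of subject s on measure m; returns scores[s][f]."""
    n_subjects = len(t_scores[0])
    n_factors = len(weights[0])
    return [[sum(t_scores[m][s] * weights[m][f] for m in range(len(weights)))
             for f in range(n_factors)]
            for s in range(n_subjects)]


if __name__ == "__main__":
    print(area_transform_t_scores([3, 1, 2]))
    # Hypothetical loadings for three measures on two factors.
    print(unit_weights([[0.62, 0.10], [0.45, 0.41], [0.39, -0.20]]))
```

Unit weighting keeps the factor scores simple to reproduce, at the cost of ignoring differences in the size of the loadings above the cutoff.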

Intercoder Agreement

An intercoder agreement analysis was done for three separate
occasions and was the method of analysis for Objective 1. At 25 weeks
there  were  14  families  (6  boys,  8  girls);  at  37  weeks  there  were  18 
families  (12  boys,  6  girls);  at  43  weeks  there  were  17  families  (11 
boys,  6  girls).    Only  six  families  were  coded  at  all  three  sessions. 

Table 2

Reciprocal Category System
Description of Behavior

Baby (1)

Code  Category       Description
01    WARMS          Baby smiles, laughs, gurgles, coos, etc. Self-reinforcing
                     behavior such as thumb sucking is also warming behavior.
02    ACCEPTS        Passive acceptance of situation; the child does not ignore
                     the person or object but neither does he respond.
03    AMPLIFIES      Simple imitation or expansion of behavior which is begun
                     by parent.
04    ELICITS        Attempts to get parents to respond which are made in a
                     questioning manner; "asks" for help or assistance.
05    RESPONDS       Baby responds appropriately (may be correct or incorrect)
                     to parent eliciting, initiating or directing.
06    INITIATES      Exploratory behavior which has no observable antecedent.
07    DIRECTS        Any behavior which attempts to get adult attention or
                     direct adult activity.
08    CORRECTS       Task-related ignoring behavior.
09    COOLS          Task irrelevant, emotional expressions; active or passive
                     aggression, crying, hitting, biting, being uncooperative,
                     etc.
10    SILENCE        Pauses, periods of no activity.
      CONFUSION      Yawns, sneezing, coughing, wetting, "accidents" or
                     unintentional interruptions; period of confusion in which
                     communication cannot be understood by observer.
      BABY SLEEPING  Baby goes to sleep.

Mother and Father (2)

Mother  Father  Category   Description
11      21      WARMS      Tends to reduce or release tension and/or alleviate
                           threat. Cooing, laughing, clarifying and accepting
                           the feelings and emotions of another are specific
                           examples. Encourages or praises in non-task oriented
                           behavior. Deals mainly with socioemotional climate.
12      22      ACCEPTS    Positively reinforces or accepts task related
                           behavior of another.
13      23      AMPLIFIES  Clarification of, building on, and/or developing of
                           actions, behaviors, comments, and/or ideas.
14      24      ELICITS    Asks questions or requests information about content,
                           subject, or procedures being considered, with the
                           intent that the other should answer and respond
                           appropriately.
15      25      RESPONDS   Gives answers or responds to questions or requests
                           for information that are directed, initiated or
                           elicited by another person.
16      26      INITIATES  Statements of facts, information, and/or opinions
                           and ideas concerning the content, subject, or
                           procedures being considered which are self-initiated.
17      27      DIRECTS    Giving of directions, instructions, orders and/or
                           assignments to which another is expected to reply.
18      28      CORRECTS   Task-related behavior which tells another that the
                           answer or behavior of another is inappropriate or
                           incorrect.
19      29      COOLS      Non-task related behaviors which tend to create
                           tension; implied are efforts toward sarcasm,
                           ridicule, regimentation, or alienation of another
                           (i.e., bawling out someone, rejecting or criticizing
                           the opinion or judgments of another, or exercising
                           control in order to gain or maintain authority in
                           situation).

(1) Baby behavior may be either verbal or non-verbal in nature.
(2) Parent behavior must be verbal in nature.

Note: Adapted from Reciprocal Category System For Use In the Parent-
Infant Transaction Project by I. J. Gordon with J. C. Lederman and
W. G. Huitt. NIMH # 1 ROL MHIHD 27480-DI, 1976.

Table 3

Key RCS Variables from
Parent-Infant Transaction Project

No.  Measure                                                   Matrix cells
1    Baby warms, accepts; baby warms, accepts                  Rows 1,2; Col. 1,2
2    Baby warms, accepts; baby amplifies                       Rows 1,2; Col. 3
3    Baby warms, accepts; baby responds                        Rows 1,2; Col. 5
4    Baby warms, accepts; baby initiates                       Rows 1,2; Col. 6
5    Baby warms, accepts; parent accepts, amplifies            Rows 1,2; Col. 12,13
6    Baby warms, accepts; parent elicits, initiates, directs   Rows 1,2; Col. 14,16,17
7    Baby amplifies; baby warms, accepts                       Row 3; Col. 1,2
8    Baby amplifies; baby amplifies                            Row 3; Col. 3
9    Baby amplifies; baby responds                             Row 3; Col. 5
10   Baby amplifies; baby initiates                            Row 3; Col. 6
11   Baby amplifies; parent accepts, amplifies                 Row 3; Col. 12,13
12   Baby amplifies; parent elicits, initiates, directs        Row 3; Col. 14,16,17
13   Baby responds; baby warms, accepts                        Row 5; Col. 1,2
14   Baby responds; baby amplifies                             Row 5; Col. 3
15   Baby responds; baby responds                              Row 5; Col. 5
16   Baby responds; baby initiates                             Row 5; Col. 6
17   Baby responds; parent accepts, amplifies                  Row 5; Col. 12,13
18   Baby responds; parent elicits, initiates, directs         Row 5; Col. 14,16,17
19   Baby initiates; baby warms, accepts                       Row 6; Col. 1,2
20   Baby initiates; baby amplifies                            Row 6; Col. 3
21   Baby initiates; baby responds                             Row 6; Col. 5
22   Baby initiates; baby initiates                            Row 6; Col. 6
23   Baby initiates; parent accepts, amplifies                 Row 6; Col. 12,13
24   Baby initiates; parent elicits, initiates, directs        Row 6; Col. 14,16,17
25   Parent accepts, amplifies; baby warms, accepts            Rows 12,13; Col. 1,2
26   Parent accepts, amplifies; baby amplifies                 Rows 12,13; Col. 3
27   Parent accepts, amplifies; baby responds                  Rows 12,13; Col. 5
28   Parent accepts, amplifies; baby initiates                 Rows 12,13; Col. 6
29   Parent elicits, initiates, directs; baby warms, accepts   Rows 14,16,17; Col. 1,2
30   Parent elicits, initiates, directs; baby amplifies        Rows 14,16,17; Col. 3
31   Parent elicits, initiates, directs; baby responds         Rows 14,16,17; Col. 5
32   Parent elicits, initiates, directs; baby initiates        Rows 14,16,17; Col. 6

Each session was analyzed separately using a randomized block
factorial design (see Figure 1). The full model for this design is
shown in Table 4. The Expected Mean Squares and the unbiased Estimators
of the Components of Variance for each source of variance are shown in
Table 5. Both coders and subjects were considered random; Parent Sex
was considered fixed. Throughout this study fixed facets are
designated by capitalizing the facet name (e.g., Parent Sex); random facets
are designated by use of all lower case letters (e.g., coder). The
Expected Mean Squares and the Estimates of the Components of Variance
were computed using the BMD Analysis of Variance Program BMD08V
(Dixon, 1974).

The intercoder agreement for ratings of factor scores of parent-infant
interaction for each session was evaluated through the calculation
of four generalizability coefficients. The first was $\rho^2(b, c, P)$,
which provided a measure of intercoder agreement when generalization
was intended for all coders, all subjects and the two specific levels
of Parent Sex. The equation for computing this coefficient is shown
in Table 6.

If this coefficient was low it was necessary to reduce the universe
of generalization by looking at a first-order coefficient, of which
there were two. The second coefficient was $\rho^2(b, c^*, P)$, which
provided a measure of generalizability when the universe of generalization
was to all subjects and both levels of Parent Sex, but only to the specific
coders used in this study. This is indicated by an asterisk for the
facet of coder and was adapted from Cronbach et al. (1972). Notice
that the numerator (the definition of universe score) changes in that the
variance due to coder x block is added to the variance due to block
but the denominator (the definition of observed score) does not change.
The generalizability coefficient will increase to the extent that ratings
of subjects change when observed by either of the coders. However,
this coefficient will represent a smaller universe in that the facet of
coder must be considered fixed when estimating the generalizability of
the factor scores. Therefore, it was only considered if the
generalizability coefficient $\rho^2(b, c, P)$ was less than .70. This equation is also
shown in Table 6.

Figure 1

Block Diagram for Randomized Block Factorial Design
for Intercoder Agreement Analysis
(columns: coder 1 and coder 2, each with Parent Sex 1 and Parent Sex 2;
rows: subject 1 through subject n)

Table 4

Full Model and Notation for Randomized Block
Factorial Design for
Intercoder Agreement Analysis

$$X_{ijk} = U + c_i + P_j + b_k + cP_{ij} + cb_{ik} + Pb_{jk} + E_{ijk}$$

where

U = grand mean;
$c_i$ = effect of coder i;
$P_j$ = effect of Parent j;
$b_k$ = a constant associated with block k, where a block is an infant;
$cP_{ij}$ = effect of interaction of coder i with Parent j;
$cb_{ik}$ = effect of interaction of coder i with block k;
$Pb_{jk}$ = effect of interaction of Parent j with block k;
$E_{ijk}$ = experimental error;

and there are

Q levels of $c_i$,
R levels of $P_j$,
N levels of $b_k$.

Table 5

Expected Mean Squares and Estimates of Variance
Components for Each Source of Variance

Source   Expected Mean Square               Estimate of Variance Component
coder    $RN\sigma_c^2 + R\sigma_{cb}^2$    $\hat{\sigma}_c^2 = \frac{1}{RN}(\ldots)$

The third was $\rho^2(b, c, P^*)$, which provided a measure of
generalizability when the universe of generalization was to all subjects and
coders, but to only one level of Parent Sex. This equation is also
shown in Table 6.

If one of these latter two coefficients was still not acceptable,
a fourth coefficient, $\rho^2(b, c^*, P^*)$, could be considered. The universe
of generalization in this case was to all subjects, this specific set
of coders and to a specific level of Parent Sex. Again, the equation
is shown in Table 6.
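
The way the four intercoder agreement coefficients relate to one another can be sketched as follows. The text states that restricting a facet adds its interaction with block to the numerator while the denominator stays the same. The denominator used here (the sum of the block, coder x block, Parent Sex x block, and error components) is an assumption made for this sketch, as are the function name and the component values; the sketch does not reproduce the equations of Table 6.

```python
def intercoder_coefficients(var_b, var_cb, var_pb, var_e):
    """Return the four coefficients under an ASSUMED common denominator.

    var_b  : variance component for block (infant)
    var_cb : coder x block component
    var_pb : Parent Sex x block component
    var_e  : residual (error) component
    """
    # Assumption: observed score variance is the sum of all components
    # that involve the block facet.
    denominator = var_b + var_cb + var_pb + var_e
    return {
        "rho2(b,c,P)": var_b / denominator,
        # An asterisk marks a facet restricted to the levels actually used;
        # its interaction with block moves into the universe score.
        "rho2(b,c*,P)": (var_b + var_cb) / denominator,
        "rho2(b,c,P*)": (var_b + var_pb) / denominator,
        "rho2(b,c*,P*)": (var_b + var_cb + var_pb) / denominator,
    }


if __name__ == "__main__":
    # Hypothetical variance components, for illustration only.
    for label, value in intercoder_coefficients(6.0, 2.0, 1.0, 1.0).items():
        flag = "" if value >= 0.70 else "  (below .70)"
        print(f"{label}: {value:.2f}{flag}")
```

Under this assumption each restriction can only raise the coefficient, which matches the stepwise procedure of moving to a narrower universe of generalization only when the broader coefficient falls below .70.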

Reliability

Subjects crossed with coders. The second analysis was of the
reliability of factor scores of parent-infant interaction using matched
data (i.e., all subjects were observed by both coders). This is the
type of reliability study recommended by Medley and Mitzel (1963) and
Cronbach et al. (1963). Six families (4 boys, 2 girls) were observed by
both coders for each of three sessions.

The data were analyzed using a randomized block factorial design
(see Figure 2) with the facets being coder, Parent Sex and Occasion.
This analysis allows the direct comparison of the relative effects of
each of these facets through the comparison of variance explained
(i.e., percent of total sums of squares) or through comparison of the
estimates of variance components as well as the comparison of the
respective generalizability coefficients. However, all three of these
comparisons are fairly crude and do not allow for significance testing.
The full model for this design is shown in Table 7. The Expected Mean
Squares and the unbiased Estimates of the Components of Variance are
shown in Table 8. Coders and subjects were considered random; Parent
Sex and Occasion were considered fixed. Again, the Expected Mean
Squares and the Estimates of the Components of Variance were computed
using the BMD Analysis of Variance Program BMD08V (Dixon, 1974).

Figure 2

Block Diagram for Randomized Block Factorial Design
for Reliability Analysis
When Subjects Were Crossed With Coders

In this analysis, which was the method for analyzing Objective 2,
eight different generalizability coefficients were determined for each
factor. The first coefficient was $\rho^2(b, c, P, O)$, which provided an
estimate of the reliability of ratings of subjects on factor scores of
parent-infant interaction when generalization was intended for all
coders and subjects, two specific levels of Parent Sex and three
specific Occasions. The equation for computing this coefficient is shown
in Table 9.

If this coefficient was low it was necessary to look to a first-order
coefficient, of which there were three. The second was
$\rho^2(b, c^*, P, O)$, which provided an estimate of the reliability of ratings
of subjects on factor scores of parent-infant interaction when
generalization was intended over all subjects and to all levels of Parent Sex
and Occasion included in this analysis but to only the specific coders
used in this study. The third coefficient was $\rho^2(b, c, P^*, O)$, which
provided an estimate of the reliability of ratings of subjects on
factor scores of parent-infant interaction when generalization was
intended to only one level of Parent Sex, but everything else was as in
the first coefficient. The fourth coefficient was $\rho^2(b, c, P, O^*)$ and
provided an estimate of reliability when generalization was intended to
only one level of Occasion, but everything else was as in the first
coefficient. The equations for computing these coefficients are shown
in Table 9.

Table 7

Full Model and Notation for Randomized
Block Factorial Design

$$X_{ijkm} = U + c_i + P_j + O_k + b_m + cP_{ij} + cO_{ik} + cb_{im} + PO_{jk} + Pb_{jm} + Ob_{km} + cPO_{ijk} + cPb_{ijm} + cOb_{ikm} + POb_{jkm} + E_{ijkm}$$

where

U = grand mean
c = coder
P = Parent Sex
O = Occasion
b = block, where block equals Infant
E = experimental error

and there are

Q levels of $c_i$
R levels of $P_j$
T levels of $O_k$
N levels of $b_m$

Table 8

Expected Mean Squares and Estimates of Variance
Components for Each Source of Variance

Source   Expected Mean Square   Estimate of Variance Component

If one of these three coefficients was not adequate it was necessary
to look to a second-order coefficient, thereby further reducing the
universe of generalization. There were again three coefficients. The
fifth coefficient was $\rho^2(b, c^*, P^*, O)$, which provided an estimate of the
reliability of factor scores when generalization was intended to the
specific coders used in this study and one level of Parent Sex, but to
all subjects and all levels of Occasion used in this analysis. The
sixth was $\rho^2(b, c^*, P, O^*)$, which provided an estimate of reliability when
generalization was intended to the specific coders used in this study
and one level of Occasion, but to all subjects and both levels of
Parent Sex. The seventh coefficient was $\rho^2(b, c, P^*, O^*)$, which provided an
estimate of reliability when generalization was intended to only one
level of Parent Sex and one level of Occasion, but to all subjects and
all coders. The equations for these three coefficients are also shown
in Table 9.

If one of these coefficients was still not adequate, it was necessary
to look to the third-order coefficient $\rho^2(b, c^*, P^*, O^*)$, which
provided an estimate of the reliability of factor scores when generalization
was intended to all subjects, but to the specific coders used in this
study, one level of Parent Sex and one level of Occasion. The equation
for this coefficient is shown in Table 9.
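
The count of eight coefficients follows from the three facets (coder, Parent Sex, Occasion) that can each be either left unrestricted or restricted to the levels observed. A brief sketch of that enumeration, grouped by the number of restricted facets, is given below; the label format and function name are illustrative only.

```python
from itertools import combinations

# Facets that may be restricted; block (b) is never restricted.
FACETS = ("c", "P", "O")


def coefficient_labels():
    """Return {order: [labels]}, where order is the number of starred facets."""
    by_order = {}
    for order in range(len(FACETS) + 1):
        labels = []
        for restricted in combinations(FACETS, order):
            parts = ["b"] + [f + "*" if f in restricted else f for f in FACETS]
            labels.append("rho2(" + ",".join(parts) + ")")
        by_order[order] = labels
    return by_order


if __name__ == "__main__":
    for order, labels in coefficient_labels().items():
        print(order, labels)
```

The enumeration yields one zero-order, three first-order, three second-order and one third-order coefficient, in the same order in which they are described above.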

Subjects nested within coders. The factor scores were also analyzed
separately using a five-facet split-plot design with each facet having
two levels (see Figure 3); this was the method of analysis for Objective 3.
The analysis of the generalizability of factors of parent-infant
interaction using this complex design has the distinct advantage of
investigating the reliability of these factors in a sophisticated manner
without the additional expense of double-coding (i.e., each family coded
twice, once by each coder) all families for all sessions. However,
this design does have a serious flaw in that variance due to coder is
completely confounded with the variance due to the group to which the
coder was assigned (Kirk, 1968). If there were significant differences
between groups, then these differences could mask or exaggerate differences
between coders.

Campbell  and  Stanley  (1963)  have  suggested  that  different  designs 
(each  with  different  flaws)  could  be  patched  together  such  that  the 
overall  analysis  is  less  flawed  than  any  one  design  used  by  itself. 
In  this  specific  case  this  can  be  done  by  using  information  from  the 
two previous analyses where some portion of the families were double
coded  at  each  session  in  order  to  get  some  idea  of  coder  effects  before 
the  more  complex  design  is  used. 

The 28 families were first rank-ordered according to the average
SES Index for the family, calculated by adding the father's Index to
the mother's Index and dividing by two. Taking the families two at
a time, beginning with the highest Index (i.e., lowest score), the
families were randomly assigned to one of two groups, balancing for
sex of infant. One observer was then randomly assigned to code all
of the families in one group. The Bayley Mental Development Index
(Bayley, 1969) was administered to the infants at age 12 months and
a t-test was done to determine if there were significant differences
on this scale between the infants in the two groups.
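
The matched-pair assignment described above can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually run for the study: the function name, the seed and the SES values are hypothetical, and the additional balancing for sex of infant is left out for brevity.

```python
import random

def assign_matched_pairs(families, seed=0):
    """Rank families by SES index, then split each adjacent pair at random.

    `families` is a list of (family_id, ses_index) tuples. A lower index
    means higher SES, so an ascending sort puts the highest SES first.
    Balancing for sex of infant is NOT handled (simplifying assumption).
    """
    rng = random.Random(seed)
    ranked = sorted(families, key=lambda fam: fam[1])
    group_1, group_2 = [], []
    # Walk the ranked list two families at a time.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random assignment within the matched pair
        group_1.append(pair[0][0])
        group_2.append(pair[1][0])
    return group_1, group_2

# Hypothetical SES indices for 28 families (made-up values).
families = [(f"fam{n:02d}", 11 + 2 * n) for n in range(28)]
group_1, group_2 = assign_matched_pairs(families)
print(len(group_1), len(group_2))
```

Because each pair is adjacent in the SES ranking, the two groups end up closely matched on SES whichever way the within-pair shuffle falls.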

The facets used in the split-plot design were coder, Sex of Parent,
Sex of Infant, Age of Infant and Type of Task. The levels for Age of
Infant were determined by separating the sessions into two groups:
(1) less than seven months (19 and 25 weeks) and (2) over seven months
(37 and 43 weeks).

Levels for Type of Task were established through analysis of a
measure (Task Mastery) from the Home Scale Observation Instrument
(Watts and Barnett, 1971), which was a measure of the child's ability
to correctly perform the task. An analysis of the means for each
session indicated that two tasks were more readily accomplished by the
infants ($\bar{x}_{19} = 4.17$; $\bar{x}_{37} = 2.94$) and two tasks were performed less well
($\bar{x}_{25} = .67$; $\bar{x}_{43} = .28$). Therefore, the Type of Task was divided into
two levels, one of high mastery (19 and 37 weeks) and a second of low
mastery (25 and 43 weeks). The full model for this design is shown
in Table 10. The Expected Mean Squares and the unbiased Estimators of
the Components of Variance for each source of variance are shown in
Table 11. All variables were considered fixed except coders and
subjects, which were considered random. Again, fixed facets were
designated by capitalizing the facet (e.g., Parent Sex); random
Table 10

Full Model and Notation for
Five-facet Split-plot Design

$$
\begin{aligned}
&\mu + c_h + I_k + cI_{hk} + \mathrm{sub}/cI_{m(hk)} + P_j + cP_{hj} + PI_{jk} + cPI_{hjk} \\
&+ Ps/cI_{jm(hk)} + A_l + cA_{hl} + IA_{kl} + cIA_{hkl} + As/cI_{lm(hk)} + T_o \\
&+ cT_{ho} + IT_{ko} + cIT_{hko} + Ts/cI_{om(hk)} + PA_{jl} + cPA_{hjl} + PIA_{jkl} + cPIA_{hjkl} \\
&+ PAs/cI_{jlm(hk)} + PT_{jo} + cPT_{hjo} + PIT_{jko} + cPIT_{hjko} \\
&+ PTs/cI_{jom(hk)} + AT_{lo} + cAT_{hlo} + IAT_{klo} + cIAT_{hklo} + ATs/cI_{lom(hk)} \\
&+ PAT_{jlo} + cPAT_{hjlo} + PIAT_{jklo} + cPIAT_{hjklo} + PATs/cI_{jlom(hk)} \\
&+ E_{g(hjklo)}
\end{aligned}
$$

where
    $\mu$ = Grand mean
    c = coder
    I = Infant Sex
    P = Parent Sex
    A = Infant Age Grouping
    T = Type of Task
    s = subject
    E = Experimental Error

and there are
    V levels of c,
    U levels of I,
    R levels of P,
    Q levels of A,
    W levels of T,
    N subjects
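Table 11

Expected Mean Squares and Estimate of Variance Components
for Each Source of Variance

Source       Expected Mean Square                                                      Estimate of Variance Component

c            $NVURQ\sigma^2_c + VUQ\sigma^2_{s(cI)}$                                     $\hat{\sigma}^2_c = \frac{1}{NVURQ}(MS_c - MS_{s(cI)})$
I            $NVUQW\sigma^2_I + NVUQ\sigma^2_{cI} + VUQ\sigma^2_{s(cI)}$                 $\hat{\sigma}^2_I = \frac{1}{NVUQW}(MS_I - MS_{cI})$
cI           $NVUQ\sigma^2_{cI} + VUQ\sigma^2_{s(cI)}$                                   $\hat{\sigma}^2_{cI} = \frac{1}{NVUQ}(MS_{cI} - MS_{s(cI)})$
sub/cI       $VUQ\sigma^2_{s(cI)}$                                                       $\hat{\sigma}^2_{s(cI)} = \frac{1}{VUQ}(MS_{s(cI)})$
P            $NVURW\sigma^2_P + NVUR\sigma^2_{cP} + VU\sigma^2_{Ps(cI)}$                 $\hat{\sigma}^2_P = \frac{1}{NVURW}(MS_P - MS_{cP})$
cP           $NVUR\sigma^2_{cP} + VU\sigma^2_{Ps(cI)}$                                   $\hat{\sigma}^2_{cP} = \frac{1}{NVUR}(MS_{cP} - MS_{Ps(cI)})$
PI           $NVUW\sigma^2_{PI} + NVU\sigma^2_{cPI} + VU\sigma^2_{Ps(cI)}$               $\hat{\sigma}^2_{PI} = \frac{1}{NVUW}(MS_{PI} - MS_{cPI})$
cPI          $NVU\sigma^2_{cPI} + VU\sigma^2_{Ps(cI)}$                                   $\hat{\sigma}^2_{cPI} = \frac{1}{NVU}(MS_{cPI} - MS_{Ps(cI)})$
P*sub/cI     $VU\sigma^2_{Ps(cI)}$                                                       $\hat{\sigma}^2_{Ps(cI)} = \frac{1}{VU}(MS_{Ps(cI)})$
A            $NVRQW\sigma^2_A + NVRQ\sigma^2_{cA} + VQ\sigma^2_{As(cI)}$                 $\hat{\sigma}^2_A = \frac{1}{NVRQW}(MS_A - MS_{cA})$
cA           $NVRQ\sigma^2_{cA} + VQ\sigma^2_{As(cI)}$                                   $\hat{\sigma}^2_{cA} = \frac{1}{NVRQ}(MS_{cA} - MS_{As(cI)})$
IA           $NVQW\sigma^2_{IA} + NVQ\sigma^2_{cIA} + VQ\sigma^2_{As(cI)}$               $\hat{\sigma}^2_{IA} = \frac{1}{NVQW}(MS_{IA} - MS_{cIA})$
cIA          $NVQ\sigma^2_{cIA} + VQ\sigma^2_{As(cI)}$                                   $\hat{\sigma}^2_{cIA} = \frac{1}{NVQ}(MS_{cIA} - MS_{As(cI)})$
A*sub/cI     $VQ\sigma^2_{As(cI)}$                                                       $\hat{\sigma}^2_{As(cI)} = \frac{1}{VQ}(MS_{As(cI)})$
T            $NURQW\sigma^2_T + NURQ\sigma^2_{cT} + UQ\sigma^2_{Ts(cI)}$                 $\hat{\sigma}^2_T = \frac{1}{NURQW}(MS_T - MS_{cT})$
cT           $NURQ\sigma^2_{cT} + UQ\sigma^2_{Ts(cI)}$                                   $\hat{\sigma}^2_{cT} = \frac{1}{NURQ}(MS_{cT} - MS_{Ts(cI)})$
IT           $NUQW\sigma^2_{IT} + NUQ\sigma^2_{cIT} + UQ\sigma^2_{Ts(cI)}$               $\hat{\sigma}^2_{IT} = \frac{1}{NUQW}(MS_{IT} - MS_{cIT})$
cIT          $NUQ\sigma^2_{cIT} + UQ\sigma^2_{Ts(cI)}$                                   $\hat{\sigma}^2_{cIT} = \frac{1}{NUQ}(MS_{cIT} - MS_{Ts(cI)})$
T*sub/cI     $UQ\sigma^2_{Ts(cI)}$                                                       $\hat{\sigma}^2_{Ts(cI)} = \frac{1}{UQ}(MS_{Ts(cI)})$
PA           $NVRW\sigma^2_{PA} + NVR\sigma^2_{cPA} + V\sigma^2_{PAs(cI)}$               $\hat{\sigma}^2_{PA} = \frac{1}{NVRW}(MS_{PA} - MS_{cPA})$
cPA          $NVR\sigma^2_{cPA} + V\sigma^2_{PAs(cI)}$                                   $\hat{\sigma}^2_{cPA} = \frac{1}{NVR}(MS_{cPA} - MS_{PAs(cI)})$
PIA          $NVW\sigma^2_{PIA} + NV\sigma^2_{cPIA} + V\sigma^2_{PAs(cI)}$               $\hat{\sigma}^2_{PIA} = \frac{1}{NVW}(MS_{PIA} - MS_{cPIA})$
cPIA         $NV\sigma^2_{cPIA} + V\sigma^2_{PAs(cI)}$                                   $\hat{\sigma}^2_{cPIA} = \frac{1}{NV}(MS_{cPIA} - MS_{PAs(cI)})$
PA*sub/cI    $V\sigma^2_{PAs(cI)}$                                                       $\hat{\sigma}^2_{PAs(cI)} = \frac{1}{V}(MS_{PAs(cI)})$
PT           $NURW\sigma^2_{PT} + NUR\sigma^2_{cPT} + U\sigma^2_{PTs(cI)}$               $\hat{\sigma}^2_{PT} = \frac{1}{NURW}(MS_{PT} - MS_{cPT})$
cPT          $NUR\sigma^2_{cPT} + U\sigma^2_{PTs(cI)}$                                   $\hat{\sigma}^2_{cPT} = \frac{1}{NUR}(MS_{cPT} - MS_{PTs(cI)})$
PIT          $NUW\sigma^2_{PIT} + NU\sigma^2_{cPIT} + U\sigma^2_{PTs(cI)}$               $\hat{\sigma}^2_{PIT} = \frac{1}{NUW}(MS_{PIT} - MS_{cPIT})$
cPIT         $NU\sigma^2_{cPIT} + U\sigma^2_{PTs(cI)}$                                   $\hat{\sigma}^2_{cPIT} = \frac{1}{NU}(MS_{cPIT} - MS_{PTs(cI)})$
PT*sub/cI    $U\sigma^2_{PTs(cI)}$                                                       $\hat{\sigma}^2_{PTs(cI)} = \frac{1}{U}(MS_{PTs(cI)})$
AT           $NRQW\sigma^2_{AT} + NRQ\sigma^2_{cAT} + Q\sigma^2_{ATs(cI)}$               $\hat{\sigma}^2_{AT} = \frac{1}{NRQW}(MS_{AT} - MS_{cAT})$
cAT          $NRQ\sigma^2_{cAT} + Q\sigma^2_{ATs(cI)}$                                   $\hat{\sigma}^2_{cAT} = \frac{1}{NRQ}(MS_{cAT} - MS_{ATs(cI)})$
IAT          $NQW\sigma^2_{IAT} + NQ\sigma^2_{cIAT} + Q\sigma^2_{ATs(cI)}$               $\hat{\sigma}^2_{IAT} = \frac{1}{NQW}(MS_{IAT} - MS_{cIAT})$
cIAT         $NQ\sigma^2_{cIAT} + Q\sigma^2_{ATs(cI)}$                                   $\hat{\sigma}^2_{cIAT} = \frac{1}{NQ}(MS_{cIAT} - MS_{ATs(cI)})$
AT*sub/cI    $Q\sigma^2_{ATs(cI)}$                                                       $\hat{\sigma}^2_{ATs(cI)} = \frac{1}{Q}(MS_{ATs(cI)})$
PAT          $NRW\sigma^2_{PAT} + NR\sigma^2_{cPAT} + \sigma^2_{PATs(cI)}$               $\hat{\sigma}^2_{PAT} = \frac{1}{NRW}(MS_{PAT} - MS_{cPAT})$
cPAT         $NR\sigma^2_{cPAT} + \sigma^2_{PATs(cI)}$                                   $\hat{\sigma}^2_{cPAT} = \frac{1}{NR}(MS_{cPAT} - MS_{PATs(cI)})$
PIAT         $NW\sigma^2_{PIAT} + N\sigma^2_{cPIAT} + \sigma^2_{PATs(cI)}$               $\hat{\sigma}^2_{PIAT} = \frac{1}{NW}(MS_{PIAT} - MS_{cPIAT})$
cPIAT        $N\sigma^2_{cPIAT} + \sigma^2_{PATs(cI)}$                                   $\hat{\sigma}^2_{cPIAT} = \frac{1}{N}(MS_{cPIAT} - MS_{PATs(cI)})$
PAT*sub/cI   $\sigma^2_{PATs(cI)}$                                                       $\hat{\sigma}^2_{PATs(cI)} = MS_{PATs(cI)}$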
facets were designated through the use of all lower case letters
(e.g., coder).
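
The estimators in Table 11 share one form: the mean square for an effect, minus the mean square of its error term, divided by a coefficient. A minimal sketch of that arithmetic follows. The function name and the mean squares are hypothetical, and the cell size of seven subjects per coder x Infant Sex cell is an assumption inferred from 28 families split evenly over four cells.

```python
def variance_component(ms_effect, ms_error_term, coefficient):
    """ANOVA (method-of-moments) estimate of a variance component."""
    return (ms_effect - ms_error_term) / coefficient

# Assumed layout: 7 subjects per coder x Infant Sex cell and two levels
# each of Infant Sex, Parent Sex, Age Grouping and Type of Task.
n_subjects, n_infant_sex, n_parent_sex, n_age, n_task = 7, 2, 2, 2, 2

# Coefficient for the coder main effect: the number of observations
# that enter each coder mean (7 * 2 * 2 * 2 * 2 = 112).
coder_coefficient = n_subjects * n_infant_sex * n_parent_sex * n_age * n_task

# Hypothetical mean squares, for illustration only.
ms_coder, ms_subjects_within = 180.0, 68.0
estimate = variance_component(ms_coder, ms_subjects_within, coder_coefficient)

# If the effect's mean square falls below that of its error term,
# the estimate is negative even though a true variance cannot be.
negative_estimate = variance_component(40.0, 68.0, coder_coefficient)
print(estimate, negative_estimate)
```

The second call shows how a negative estimate arises purely from sampling fluctuation in the mean squares.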

In this analysis eight different generalizability coefficients
were determined for each factor. The first coefficient was
$\rho^2(s,c,I,P,A,T)$, which provided an estimate of the reliability of
ratings of subjects on factor scores of parent-infant interaction when
generalization was intended for all coders, both levels of Infant Sex,
both levels of Parent Sex, the two Age Groupings of infants considered
in this study and the two levels of Task considered. The equation for
computing the coefficient is shown in Table 12.

The second coefficient was $\rho^2(s,c,I,P^*,A,T)$, which provided an
estimate of the reliability of ratings of subjects on factor scores of
parent-infant interaction when generalization was intended to only one
level of Parent Sex, but everything else remained the same. The
equation for computing this coefficient is also shown in Table 12.

The third coefficient was $\rho^2(s,c,I,P,A^*,T)$, which provided an
estimate of the reliability of ratings of subjects when generalization
was intended to only one level of Infant Age Grouping, but everything
else was as in the first coefficient. Again, the equation for computing
this coefficient is shown in Table 12.

The fourth coefficient was $\rho^2(s,c,I,P,A,T^*)$, which provided an
estimate of the reliability of ratings of subjects when generalization
was intended to only one level of Task, but again, everything else was
as in the first coefficient. The equation for computing this
coefficient is also shown in Table 12.

If the first generalizability coefficient was less than .70, it
was necessary to look at the next three coefficients, each of which
represented a first-order interaction with ratings of subjects. If
each of these coefficients was less than .70, one could not generalize
subjects' scores when the level of only one facet was held constant
and, therefore, one had to look to a second-order coefficient.
These are shown in Table 12.

Again, it must be emphasized that the generalizability coefficient
is defined as the ratio of universe score variance to observed score
variance; the coefficient should become larger as more of the variance
is considered true score variance rather than error variance. However,
the universe to which that score would generalize becomes increasingly
smaller. If the third-order coefficient $\rho^2(s,c,I,P^*,A^*,T^*)$ for a
factor was not above .70, there were four alternative explanations:

1)  the  coder  facet  and/or  the  infant  sex  facet  was  having  some 
systematic  influence  on  the  scores; 

2)  these  factor  scores  varied  nonsystematically; 

3)  these  factor  scores  varied  systematically  on  some  facets  not 
considered  in  this  study; 

4)  some  combination  of  alternatives  1-3. 
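
The trade-off between the size of the coefficient and the size of the universe can be illustrated with a small sketch. It is deliberately simplified: every component is given unit weight, whereas the equations in Table 12 may weight components differently, and the variance components and labels below are hypothetical.

```python
def g_coefficient(universe_score_var, error_var):
    """Ratio of universe score variance to observed score variance."""
    return universe_score_var / (universe_score_var + error_var)

def first_adequate(coefficients, threshold=0.70):
    """Return the label of the first coefficient above the threshold.

    `coefficients` is a list of (label, value) pairs ordered from the
    widest universe of generalization to the narrowest.
    """
    for label, value in coefficients:
        if value > threshold:
            return label
    return None

# Hypothetical components: subjects, subjects x Parent Sex, residual.
var_s, var_sP, var_res = 4.0, 4.0, 2.0

# Generalizing over both levels of Parent Sex: the s x P interaction
# counts as error.
wide = g_coefficient(var_s, var_sP + var_res)

# Restricting to one level of Parent Sex: the s x P interaction moves
# into universe score variance, so the coefficient rises.
narrow = g_coefficient(var_s + var_sP, var_res)

chosen = first_adequate([("both levels of P", wide), ("one level of P", narrow)])
print(wide, narrow, chosen)
```

The coefficient rises only because the claim being made is narrower, which is the point made in the text.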

Summary 

Twenty-eight white, first-born infants were video-taped interacting
with their mothers and fathers on four separate occasions. Two observers
coded the parent-infant interaction using the Reciprocal Category
System (RCS), from which 32 measures were derived which accounted
for 84% of the total tallies. These measures were factor analyzed
and five factor scores were produced.

The intercoder agreement on these factor scores for three of the
four occasions was analyzed using a randomized block factorial design
with the facets of coder and Parent Sex. The reliability of these
factor scores was analyzed using two additional designs. The first of
these was also a randomized block factorial design, using the facets of
coder, Parent Sex and Occasion. The second was a split-plot design
where the 28 families were divided into two separate groups of 14 and
one observer was assigned to each group. The facets were coder, Infant
Sex, Parent Sex, Age of Infant and Type of Task.


CHAPTER  IV 


ANALYSIS 

The purpose of this study was to investigate the reliability of
parent-infant interaction through the use of generalizability theory.
Video-tapes of parent-infant interaction were coded by two observers
using the Reciprocal Category System (RCS). Factor scores were
produced and analyzed using three designs. The first design was used
to investigate intercoder agreement; the second and third designs were
used to investigate reliability. In one of these designs subjects were
crossed with coders; in the other, subjects were nested within coders.

The  factor  analysis,  variance  components  and  generalizability 
coefficients  are  reported  in  this  chapter. 

Development  of  Factor  Scores 
The 32 measures from the RCS were factor analyzed using a principal
axis solution. Based upon a scree test, five factors were rotated
to the Varimax criterion and the resulting solution accounted for 75%
of the common variance (see Tables 13-17). All of the loadings were
positive and above .40. Only one variable (14) loaded on two factors
(Factors 1 and 3). One interesting finding was that for the most part it
was the babies' behavior which determined the factor structure. The
parent behaviors of accepting, amplifying, eliciting, initiating and
directing all loaded on the same factor, depending upon the baby behavior
in the interaction sequence.
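
For readers unfamiliar with the Varimax criterion, the rotation step can be sketched in plain Python. This is an illustration only: the software used in the study is not stated, the principal axis extraction is not shown, and the sketch applies raw varimax (no Kaiser normalization) through successive planar rotations of factor pairs. The function name and the small loading matrix are hypothetical.

```python
import math

def varimax(loadings, max_sweeps=50, tol=1e-12):
    """Raw varimax by successive planar rotations of factor pairs."""
    rows = [list(map(float, r)) for r in loadings]
    p, k = len(rows), len(rows[0])
    for _ in range(max_sweeps):
        largest = 0.0
        for a in range(k - 1):
            for b in range(a + 1, k):
                u = [r[a] ** 2 - r[b] ** 2 for r in rows]
                v = [2.0 * r[a] * r[b] for r in rows]
                sum_u, sum_v = sum(u), sum(v)
                c = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                d = 2.0 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximizes the variance of squared loadings
                # for this pair of factors.
                num = d - 2.0 * sum_u * sum_v / p
                den = c - (sum_u ** 2 - sum_v ** 2) / p
                phi = 0.25 * math.atan2(num, den)
                cos_phi, sin_phi = math.cos(phi), math.sin(phi)
                for r in rows:
                    x, y = r[a], r[b]
                    r[a] = x * cos_phi + y * sin_phi
                    r[b] = -x * sin_phi + y * cos_phi
                largest = max(largest, abs(phi))
        if largest < tol:  # no pair needed a noticeable rotation
            break
    return rows

# Hypothetical example: a perfect simple structure turned by 30 degrees.
theta = math.radians(30)
unrotated = [[math.cos(theta), math.sin(theta)],
             [-math.sin(theta), math.cos(theta)]]
rotated = varimax(unrotated)
```

In this toy case the rotation recovers the simple structure, with each variable loading on a single factor; communalities (row sums of squared loadings) are unchanged by the rotation.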

Table 13

RCS Factor 1
Parent-Infant Interaction; Baby Amplifies

Item
Number  Description                                             Loading

  12    Baby amplifies; parent elicits, initiates, directs        .88
        03; 14,16,17
  30    Parent elicits, initiates, directs; baby amplifies        .83
        14,16,17; 03
  26    Parent accepts, amplifies; baby amplifies                 .68
        12,13; 03
  11    Baby amplifies; parent accepts, amplifies                 .66
        03; 12,13
  14    Baby responds; baby amplifies                             .64
        05; 03
   8    Baby amplifies; baby amplifies                            .58
        03; 03
   7    Baby amplifies; baby warms, accepts                       .51
        03; 01,02
   9    Baby amplifies; baby responds                             .44
        03; 05
  10    Baby amplifies; baby initiates                            .44
        03; 06
   2    Baby warms, accepts; baby amplifies                       .43
        01,02; 03

Eigen Value = 3.41


Table 14

RCS Factor 2
Parent-Infant Interaction; Baby Responds

Item
Number  Description                                             Loading

  17    Baby responds; parent accepts, amplifies                  .87
        05; 12,13
  27    Parent accepts, amplifies; baby responds                  .87
        12,13; 05
  18    Baby responds; parent elicits, initiates, directs         .80
        05; 14,16,17
  31    Parent elicits, initiates, directs; baby responds         .80
        14,16,17; 05

Eigen Value = 3.86


Table 15

RCS Factor 3
General Baby Behavior

Item
Number  Description                                             Loading

   3    Baby warms, accepts; baby responds                        .67
        01,02; 05
  16    Baby responds; baby initiates                             .65
        05; 06
   4    Baby warms, accepts; baby initiates                       .59
        01,02; 06
  19    Baby initiates; baby warms, accepts                       .59
        06; 01,02
  13    Baby responds; baby warms, accepts                        .55
        05; 01,02
  21    Baby initiates; baby responds                             .51
        06; 05
   1    Baby warms, accepts; baby warms, accepts                  .44
        01,02; 01,02
  15    Baby responds; baby responds                              .43
        05; 05
  14    Baby responds; baby amplifies                             .41
        05; 03

Eigen Value = 4.29


Table 16

RCS Factor 4
Parent-Infant Interaction; Baby Warms, Accepts

Item
Number  Description                                             Loading

   6    Baby warms, accepts; parent elicits, initiates, directs   .86
        01,02; 14,16,17
  29    Parent elicits, initiates, directs; baby warms, accepts   .84
        14,16,17; 01,02
  25    Parent accepts, amplifies; baby warms, accepts            .67
        12,13; 01,02
   5    Baby warms, accepts; parent accepts, amplifies            .61
        01,02; 12,13

Eigen Value = 2.73


Table 17

RCS Factor 5
Parent-Infant Interaction; Baby Initiates

Item
Number  Description                                             Loading

  32    Parent elicits, initiates, directs; baby initiates        .78
        14,16,17; 06
  24    Baby initiates; parent elicits, initiates, directs        .77
        06; 14,16,17
  28    Parent accepts, amplifies; baby initiates                 .76
        12,13; 06
  23    Baby initiates; parent accepts, amplifies                 .72
        06; 12,13

Eigen Value = 2.98


Factor 1 (parent-infant interaction; baby amplifies) was composed
of both interaction measures and baby behavior measures which involved
the baby amplifying. This was a measure of the extent to which the
baby was exhibiting extending behavior after the task was presented
by the parent.

Factor 2 (parent-infant interaction; baby responds) was composed
only of interaction measures and was an indication of the extent to
which the baby was responding while being verbally stimulated or
responded to by the parent. From an informal analysis of the
video-tapes it was apparent that the majority of these responses were
on-task behaviors (i.e., the babies were behaving in a manner appropriate
to the structured teaching situation).

Factor 3 (general baby behavior) was composed exclusively of
measures of baby behavior and was a measure of general level of baby
activity. It was also some indication that parent verbal behavior
was absent. All of the baby behaviors that were used in this analysis
were represented in this factor so that it was not possible to be more
specific.

Factor 4 (parent-infant interaction; baby warms, accepts) was
also composed only of interaction measures and was an indication of
the extent to which the baby was either warming or accepting while the
parent was exhibiting verbal behavior. These baby behaviors were
generally affective and passive as compared to the more active behaviors
of responding or initiating, although the majority were on-task
behaviors.

Factor 5 (parent-infant interaction; baby initiates) was the third
factor which was composed exclusively of interaction measures. It was
an indication of the extent to which the baby was emitting behavior
which had no immediate observable antecedent while the parent was
verbally engaging the infant. It was generally active, off-task
behavior in that the baby was doing something other than the behavior
requested in the structured teaching situation.

Analysis  of  Intercoder  Agreement 
The intercoder agreement for each of the five factors was analyzed
at three separate occasions (25 weeks, 37 weeks, 43 weeks) using
randomized block factorial designs. The total sample of the longitudinal
project (n = 38) had a mean Bayley Mental Development Quotient (MDQ) of
118.7 with a standard deviation of 12.0 (standardization sample mean = 100
with a standard deviation of 15). The infants used in the 25 weeks analysis
had a mean MDQ of 116.6 and a standard deviation of 14.3; at 37 weeks the
mean MDQ was 115.2 with a standard deviation of 13.8; at 43 weeks the
mean MDQ was 112.6 with a standard deviation of 13.1. The components
of variance and generalizability coefficients are shown for each factor
in Tables 18-22. Coder and block (block = infant) were considered random;
Parent Sex was considered fixed. A summary of the generalizability
coefficients is shown in Table 23. These coefficients were not an
estimate of reliability because variance due to occasion was not
considered. A coefficient above .70 was considered an indication of
adequate intercoder agreement.

Estimates of Variance Components

Notice that there are negative estimates for each factor, which
are especially large for the Parent Sex x block interaction for Factor
2 at 37 and 43 weeks and for block for Factor 4 at 43 weeks. As observed
by Llabre (1978), these are obviously poor estimates since a variance is,
by definition, non-negative. It is interesting that the majority of the
negative estimates involve the facet of Parent Sex; in only three
instances out of 15 is the estimate of variance for the main effect of
Parent Sex above 1.00.

The largest estimated variance component for Factor 1 (parent-infant
interaction with baby amplifying) at 25 weeks was for the coder
x block interaction ($\hat{\sigma}^2_{cb} = 6.34$); at 37 weeks it was the Parent Sex
x block interaction followed by block ($\hat{\sigma}^2_{Pb} = 8.56$, $\hat{\sigma}^2_{b} = 7.85$); at 43
weeks it was again coder x block ($\hat{\sigma}^2_{cb} = 13.68$). Notice, however, that
at 25 weeks the estimated variance component for coder was one of the
largest, while at 37 and 43 weeks it was negative. This means that at
25 weeks there were differences in the overall means between the two
coders, but these were negligible at 37 and 43 weeks. In all three
instances, though, the estimated variance component for coder x block
was one of the largest. This means that the coders were assigning
different rankings to subjects irrespective of their overall ratings for
all subjects. The coder x block component was part of the observed score
variance, while the coder main effect was not (see Table 6).
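
The difference between a coder main effect (a shift in overall means) and a coder x block interaction (a change in rankings) can be seen in a small sketch. It ignores the Parent Sex facet for simplicity, and the function name and all scores are made up for illustration.

```python
def two_way_mean_squares(scores):
    """Mean squares for a block x coder table with one score per cell.

    `scores[i][j]` is the score given to block (infant) i by coder j.
    """
    n_b, n_c = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n_b * n_c)
    block_means = [sum(row) / n_c for row in scores]
    coder_means = [sum(row[j] for row in scores) / n_b for j in range(n_c)]
    ss_block = n_c * sum((m - grand) ** 2 for m in block_means)
    ss_coder = n_b * sum((m - grand) ** 2 for m in coder_means)
    ss_inter = sum(
        (scores[i][j] - block_means[i] - coder_means[j] + grand) ** 2
        for i in range(n_b)
        for j in range(n_c)
    )
    return {
        "block": ss_block / (n_b - 1),
        "coder": ss_coder / (n_c - 1),
        "coder_x_block": ss_inter / ((n_b - 1) * (n_c - 1)),
    }

# Coder 2 scores every infant two points higher: same rankings.
shifted = [[10, 12], [14, 16], [18, 20], [22, 24]]
# Coder 2 reverses the ordering of the infants: different rankings.
reversed_ranks = [[10, 24], [14, 20], [18, 16], [22, 12]]

ms_shifted = two_way_mean_squares(shifted)
ms_reversed = two_way_mean_squares(reversed_ranks)
```

Both tables have the same coder mean square, but only the second has a coder x block mean square above zero, which is the pattern that lowers agreement on the relative standing of subjects.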

At 25 weeks the largest estimated variance component for Factors
3 and 5 was the Parent Sex x block interaction and for Factors 2 and
4 it was the residual ($\hat{\sigma}^2_{Pb} = 9.44$, $\hat{\sigma}^2_{Pb} = 28.99$, $\hat{\sigma}^2_{residual} = 35.47$,
$\hat{\sigma}^2_{residual} = 25.18$, respectively).

At 37 weeks the largest estimated variance component for Factors
2, 3, 4 and 5 was for block ($\hat{\sigma}^2_{b} = 18.50$, $\hat{\sigma}^2_{b} = 13.05$, $\hat{\sigma}^2_{b} = 29.21$,
$\hat{\sigma}^2_{b} = 24.70$, respectively).

At 43 weeks the largest estimated variance component for Factors
3, 4 and 5 was for the coder x block interaction ($\hat{\sigma}^2_{cb} = 11.04$,
$\hat{\sigma}^2_{cb} = 53.40$, $\hat{\sigma}^2_{cb} = 24.04$, respectively). The largest component for
Factor 2 was coder ($\hat{\sigma}^2_{c} = 15.57$).

Generalizability Coefficients

With  the  large  estimates  of  variance  components  for  the  coder  x 
block  interaction  and  the  residual  at  25  and  43  weeks,  one  would  not 
expect  the  generalizability  coefficients  which  estimated  the  intercoder 
agreement  when  generalization  was  intended  to  all  subjects  and  coders 
and  both  levels  of  Parent  Sex  to  be  very  high  at  those  occasions,  and 
such  was  the  case.    These  coefficients  are  shown  in  Table  23  and  were 
derived  by  substituting  the  estimates  of  variance  components  from  Tables 
18-22  into  the  equations  shown  in  Table  6.     Sixty  different  coefficients 
were  generated,  four  for  each  factor  at  each  occasion. 

The first coefficient at each occasion was $\rho^2(s,c,P)$ and was an
estimate of the intercoder agreement on a factor score of parent-infant
interaction when generalization was intended over all subjects and
coders and to both levels of Parent Sex. The only coefficients that are
above .70 are for Factors 2, 3, 4 and 5 at 37 weeks. It was therefore
necessary to look to the first-order interaction of coder or Parent
Sex x block in an attempt to increase the estimate of intercoder
agreement for each factor.

At 25 weeks the coefficient $\rho^2(s,c,P^*)$ for Factor 5 was .78; at
37 weeks it was .75 for Factor 1. This was an estimate of
generalizability when the universe of generalization was to all coders
and subjects, but to only one level of Parent Sex. Notice that in order
for these two coefficients to become larger than .70, the universe of
generalization has become smaller. One could also obtain larger
coefficients for Factors 2, 3, 4 and 5 at 37 weeks if one accepted a
smaller universe of generalization.

At 43 weeks the coefficient $\rho^2(s,c^*,P)$ became larger than .70 for
all five factors (.84, .83, .77, .91, .78, respectively). This was an
estimate of generalizability when the universe of generalization was to
all subjects and both levels of Parent Sex but to the specific coders
used in this study.

The coefficients for Factors 1 and 3 at 25 weeks did not become
larger than .70 until the universe of generalization was restricted to
all subjects, to the specific coders used in this study and one level
of Parent Sex (.82 and .93, respectively); the coefficients for Factors 2
and 4 never attained that value (.68 for both).

Reliability  Analysis 
Subjects  Crossed  With  Coders 

Estimates of Variance Components. Each of the factors was again
analyzed separately using a randomized block factorial design. The
infants used in this analysis had a mean MDQ of 114.9 and a standard
deviation of 11.8. The components of variance for each source of
variance for each factor are shown in Tables 24-28. Again, notice
that there are negative estimates for each factor, which are especially
large for Factors 2, 4, and 5. Again, a majority of these negative
estimates involve the facet of Parent Sex. In addition, Factor 4 has
a negative estimate for subjects and Factors 4 and 5 have negative
estimates for coders.

The largest estimated variance component for Factor 1 (parent-infant
interaction with baby amplifying) was for Occasion ($\hat{\sigma}^2_{O} = 5.10$),
followed by the Occasion x subject interaction ($\hat{\sigma}^2_{Os} = 4.31$, see Table 24).
Notice that the estimated variance components for coders and the
subjects x coders interaction were relatively small ($\hat{\sigma}^2_{c} = .31$,
$\hat{\sigma}^2_{sc} = .62$). The interpretation would be that the ratings of subjects
by the two coders were quite similar for each occasion, but the scores
and relative rankings did change from occasion to occasion. While it is
not proper to test for significant differences with estimates of variance
components (Cronbach et al., 1972), it is obvious that the facet Occasion
explains much more variance than does the facet coder. This variance does
not affect the generalizability coefficient, however, since all subjects
were seen by both coders, with both parents and on all three occasions
and, therefore, the variance components for these three main effects were
not included as part of the observed variance (see Table 9).

The largest estimated variance component for Factor 2 (parent-infant
interaction with infant responding) was for the residual, followed by
the Occasion x subject interaction ($\hat{\sigma}^2_{residual} = 25.25$, $\hat{\sigma}^2_{Os} = 20.06$,
see Table 25). Notice that there are three negative estimates, two of
which exceed 1.00 in absolute value ($\hat{\sigma}^2_{Ps} = -.77$, $\hat{\sigma}^2_{cPO} = -3.51$,
$\hat{\sigma}^2_{POs} = -5.70$). Cronbach et al. (1972) suggest that in cases such as
this an incorrect model may be the problem. For example, Parent Sex is
represented in each of the six negative estimates for Factors 1 and 2.
Parent Sex is also represented in three of the four negative estimates
for Factor 3, and four of the six negative estimates for Factor 4.
Perhaps a reanalysis of the data using only the facets of coder and
Occasion would prove to be a better model (i.e., negative estimates would
be a smaller percentage of the total number of estimates and/or reduced
in value). It is also true that the small number of subjects could be
influencing the situation.

The largest estimated variance component for Factor 3 (general
baby behavior) was again Occasion ($\hat{\sigma}^2_{O} = 7.86$), followed by the Parent
Sex x subject interaction ($\hat{\sigma}^2_{Ps} = 7.17$, see Table 26). Notice that
this latter is the case even though the estimated variance component
for the main effect of Parent Sex is negative ($\hat{\sigma}^2_{P} = -.69$). This means
that even though the differences between parents were negligible, the
rankings of subjects changed depending upon which parent was interacting
with the infant. This seems to provide evidence that Parent Sex should
not be dropped from the model, at least for this factor. However, if
Parent Sex were dropped from the model, the block would become equal to
a parent-infant pair, so that it would be necessary to be specific about
which parent was interacting with the infant.

The largest estimated variance component for Factor 4 (parent-infant
interaction with baby warming and/or accepting) was again for Occasion
($\hat{\sigma}^2_{O} = 29.31$, see Table 27). Notice that the estimates for the main
effects of coder, Parent Sex and subjects were each negative. This is
especially critical in the case of subjects, since differences between
subjects are a necessity for reliable data (Medley and Mitzel, 1963).

The largest estimated variance component for Factor 5 (parent-infant
interaction with baby initiating) was for the Occasion x subject
interaction followed by Occasion ($\hat{\sigma}^2_{Os} = 29.75$, $\hat{\sigma}^2_{O} = 18.19$, see Table 28).
The two negative estimates were for the main effect of coder and the
coder x Occasion interaction ($\hat{\sigma}^2_{c} = -1.15$, $\hat{\sigma}^2_{cO} = -1.19$).

Generalizability Coefficients. The generalizability coefficients
estimating the reliability of each of the five factors of parent-infant
interaction when subjects were crossed with coders are shown in Table 29.
The coefficients were derived by substituting the estimates of variance
components from Tables 24-28 into the equations shown in Table 9. Forty
different coefficients were estimated, eight for each factor.

The first coefficient was $\rho^2(s,c,P,O)$ and was an estimate of the
reliability of a factor score when generalization was intended over all
subjects and all coders, both levels of Parent Sex and three levels of
Occasion. All of these coefficients were relatively low, with the
coefficients for Factors 3, 4 and 5 being especially so. It was
therefore necessary to look to a first-order interaction in an attempt to
increase the estimate of reliability for each of the factors.

The next three coefficients were $\rho^2(s,c^*,P,O)$, $\rho^2(s,c,P^*,O)$ and
$\rho^2(s,c,P,O^*)$, which estimated the reliability of ratings when
generalization was intended over all subjects but to only the specific
coders in this study, a specific level of Parent Sex or a specific level
of Occasion, respectively. Notice that, with only one exception, the
coefficients are less than .55. The exception is for Factor 2, where the
coefficient $\rho^2(s,c,P,O^*)$ is .70. For the other factors, then, it was
necessary to look to a second-order interaction in order to increase the
reliability estimate.

There were again three coefficients which represented second-order
interactions. For Factors 1, 3 and 5 the coefficient $\rho^2(s,c,P^*,O^*)$
was larger than .70 (.79, .81, .81, respectively). However, for Factor 4
none of the coefficients exceeded .70 and, therefore, it was necessary to
look to the third-order coefficient.

This last coefficient was $\rho^2(s,c^*,P^*,O^*)$, which was an estimate of
reliability of ratings when generalization was intended for all subjects,
but to the specific coders used in this study, and to a specific level
of both Parent Sex and Occasion. Factor 4 had a value of .91 for this
coefficient.

Subjects  Nested  Within  Coders 

Estimates of Variance Components. Each of the factors was again
analyzed separately using a five-facet split-plot design. The mean
MDQ for Group 1 was 122.6 with a standard deviation of 10.7; for
Group 2 the mean was 115.1 with a standard deviation of 13.5. While this
difference was not statistically significant ($t = 1.88$; critical value
with 26 df $= 2.05$), it was approximately 1/2 standard deviation. The
components of variance for each source of variance for each factor are
shown in Tables 30-34.
Again,  notice  the  large  number  of  negative  estimates  for  each  factor. 
It  is  interesting  that  the  majority  of  these  negative  estimates  are 
interactions  involving  either  coder  or  infant  sex. 

The largest estimated variance component for Factor 1 (Parent-infant
interaction with baby amplifying) was for the residual
($\hat{\sigma}^2_{residual} = 13.95$). Since there was only one occasion per
cell, the experimental error was confounded with this term and could either
magnify or conceal actual differences between coders since coders were
nested within groups. For example, the mean for Group 2 was lower than that
for Group 1. If coder 2 were actually assigning lower scores on a particular
factor than coder 1, then these differences would be magnified by the
differences between groups. However, if coder 2 were assigning higher
scores, then the actual differences would be masked.
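
The confounding described above can be illustrated with simple arithmetic. The following is only a sketch: the group means and coder biases are hypothetical values chosen for illustration, not data from this study, and `observed_difference` is an illustrative helper rather than part of the original analysis.

```python
def observed_difference(group_means, coder_biases):
    """Observed mean difference (coder 1 minus coder 2) when each coder
    rates only one group, so group and coder effects cannot be separated."""
    observed_1 = group_means[0] + coder_biases[0]
    observed_2 = group_means[1] + coder_biases[1]
    return observed_1 - observed_2


# Hypothetical true group means: Group 2 is 2 points lower than Group 1.
groups = (10.0, 8.0)

# Coder 2 truly scores 1 point lower than coder 1; the group difference
# adds to the coder difference, so the apparent gap is magnified.
magnified = observed_difference(groups, (0.0, -1.0))

# Coder 2 truly scores 1 point higher than coder 1; the group difference
# works against the coder difference, so coder 2 still appears lower.
masked = observed_difference(groups, (0.0, 1.0))

print(magnified, masked)
```

In the second case the observed difference has the opposite sign from the true coder difference, which is the masking effect noted in the text.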

All of the sources of variance for Factor 1 involving subjects were
substantial. This means that there were initial differences between
ratings of subjects, but these ratings changed with respect to ranking
of subjects dependent upon the level of facet or facets which were
included in the study. This does not necessarily mean there were
differences between levels of the facet, but it does not preclude
differences either. For example, the estimate of the variance component
for Type of Task x subject(coder x Infant Sex) was almost equal to that for
Age of Infant x subject(coder x Infant Sex) (the two estimates were 6.18
and 6.61). Yet the estimate for Type of Task ($\hat{\sigma}^2_T = -.18$)
indicated no differences between task levels, while the estimate for Age
of Infant ($\hat{\sigma}^2_A = 7.93$) indicated substantial differences
between infant age levels. This latter variance does not affect the
generalizability coefficient, however, since all subjects were seen at
both ages and, therefore, the variance component for Infant Age Grouping
was not included as part of the observed score variance (see Table 12).

The estimates of variance components for Factor 2 (Parent-infant
interaction with baby responding) showed a very similar pattern. Again,
the largest estimate was for the residual
($\hat{\sigma}^2_{residual} = 24.93$), followed by the Infant Age Grouping x
Type of Task x subjects(coder x Infant Sex) interaction
($\hat{\sigma}^2_{ATs(cI)} = 18.83$). Also, all of the estimates involving
subjects each accounted for at least 5% of the total sums of squares.

For Factor 3 (General baby behavior) the largest estimate of
variance component was for Infant Age Grouping ($\hat{\sigma}^2_A = 9.18$).
The second and third largest were for the residual
($\hat{\sigma}^2_{residual} = 8.51$) and the Infant Age Grouping x Type of
Task x subjects(coder x Infant Sex) interaction
($\hat{\sigma}^2_{ATs(cI)} = 7.01$). Again, with only two exceptions, each of
the sources of variance involving subjects accounted for at least 5% of the
total sums of squares.

The estimates of variance components for Factor 4 (Parent-infant
interaction with baby warming or accepting) repeated the by now familiar
pattern ($\hat{\sigma}^2_{residual} = 32.50$,
$\hat{\sigma}^2_{ATs(cI)} = 22.66$). However, for Factor 4 the estimate for
the main effect of coder accounted for 16% of the total sums of squares
($\hat{\sigma}^2_c = 19.18$). It should be emphasized that in the split-plot
design variance due to differences between coders and variance due to
differences between groups to which coders were assigned were completely
confounded, so that it is possible that this estimate was overestimating
differences between coders.

For Factor 5 (Parent-infant interaction with baby initiating) the
largest estimate of variance component was for the main effect of Type
of Task ($\hat{\sigma}^2_T = 45.96$). The next three largest were for the
residual, the Infant Age Grouping x Type of Task x subjects(coder x Infant
Sex) interaction and the Parent Sex x Infant Age Grouping x subjects(coder x
Infant Sex) interaction ($\hat{\sigma}^2_{residual} = 24.19$,
$\hat{\sigma}^2_{ATs(cI)} = 21.13$, $\hat{\sigma}^2_{PAs(cI)} = 21.44$).

To summarize the information about the estimates of variance
components, it was noted that there was a large number of negative
estimates, which were obviously poor estimates since a variance is, by
definition, non-negative. The largest estimates were consistently those
associated with the residual and the Infant Age Grouping x Type of Task x
subjects(coder x Infant Sex) interaction.

Generalizability Coefficients. The generalizability coefficients
estimating the reliability of each of the five factors of parent-infant
interaction are shown in Table 35. The coefficients were derived by
substituting the estimates of variance components from Tables 30-34 into
the equations shown in Table 12. Forty different coefficients were
estimated, eight for each factor.

The first coefficient was $\rho^2(s,c,I,P,A,T)$ and was an estimate of
the reliability of a factor when generalization was intended over all
of the facets used in this study. All of these were relatively low,
the largest being .34 for Factor 3. It was therefore necessary to look
to the first-order interactions of facets by subjects in an attempt to
increase the reliability estimates for each of the factors. This was
done at the expense of generalizing to a smaller universe.

The next three generalizability coefficients were $\rho^2(s,c,I,P^*,A,T)$,
$\rho^2(s,c,I,P,A^*,T)$ and $\rho^2(s,c,I,P,A,T^*)$, which estimated the
reliability of ratings when generalization was not intended to levels of
Parent Sex, Infant Age Grouping or Type of Task, respectively. None was
higher than .50, the largest being .48 for Factor 3. If increased
reliability were desired, one could further reduce the universe to which
generalization would be made by looking at the second-order interactions
of facets by subjects.

Accordingly, the next three generalizability coefficients were
$\rho^2(s,c,I,P^*,A^*,T)$, $\rho^2(s,c,I,P^*,A,T^*)$ and
$\rho^2(s,c,I,P,A^*,T^*)$, which estimated the reliability of ratings when
generalization was not intended to levels of Parent Sex or Infant Age
Grouping, Parent Sex or Type of Task, and Infant Age Grouping or Type of
Task, respectively. For example, if one were concerned with the
reliability of each factor for a given taping session, one would look at
the generalizability coefficient $\rho^2(s,c,I,P,A^*,T^*)$, which was .62
for Factor 1, .58 for Factor 2, etc.

If still greater reliability were desired, one could look at the
last coefficient generated, $\rho^2(s,c,I,P^*,A^*,T^*)$, which estimated the
reliability of ratings when generalization was only intended for subjects,
coders and Infant Sex. The estimate of this coefficient was conservative
since the variance due to PATs(cI) was confounded with experimental error
and, therefore, not included as part of the estimate of the universe
score. Nevertheless, the coefficients ranged from lows of .75 and .74
(for Factors 1 and 4, respectively) to a high of .88 (for Factor 5),
which would seem to be adequate for most decision studies.

Additional  Analyses 

Intercoder  Agreement 

Since a large number of the estimates of variance components for
Parent Sex in the original analysis of intercoder agreement were
negative, a second analysis was done which eliminated that facet. The full
model for the randomized block design used in this analysis is the same
as that shown in Table 4 with the Parent Sex term (P) removed. Again,
coders and blocks were considered random; a block in this model indicates
a parent-infant pair. The algorithms for computing the Expected Mean
Squares and the Estimates of Variance Components are the same as shown in
Table 8 with all reference to Parent Sex likewise removed; in this case
the coder x block interaction (cb) is the residual.

The  components  of  variance  for  each  factor  are  shown  in  Tables  36-40. 
Using  this  model  there  were  only  five  negative  estimates  (four  involving 
the  main  effect  of  coder)  and  only  one  was  above  1.0.    For  all  factors 
at  37  weeks  the  largest  estimated  variance  component  was  for  block. 
However,  at  25  and  43  weeks  the  largest  estimated  variance  component 
for nine of ten cases was for the coder x block interaction.

The intercoder agreement for ratings of factor scores of parent-infant
interaction for each session was evaluated through the calculation
of a generalizability coefficient, which is shown in Table 41. Again,
this coefficient was not an estimate of reliability because variance
due to occasion was not considered. Notice that by eliminating the
facet of Parent Sex from the analysis, we are now only able to generate
one generalizability coefficient. This coefficient is approximately
equal to the coefficient $\rho^2(b,c,P^*)$ shown in Table 23. The reason
that only one coefficient can be generated is that the variance due to
the coder x block interaction is now confounded with that due to
experimental error; also, the variance due to the Parent Sex x block
interaction is now a part of the variance due to block.
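
As an illustration of the single-facet model just described, the sketch below estimates variance components for a block x coder layout with one score per cell and then forms one coefficient from them. It is a minimal sketch under stated assumptions, not the algorithm of Table 8 or the coefficient of Table 41: the function names are hypothetical, negative estimates are truncated to zero when the coefficient is formed, and the coefficient uses the standard relative-error form for the mean over `n_coders` coders.

```python
def variance_components(scores):
    """Estimate variance components for a blocks x coders table.

    scores[b][c] is the score for block b as rated by coder c.
    With one observation per cell, the coder x block interaction
    is confounded with error and serves as the residual.
    """
    n_b = len(scores)
    n_c = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n_b * n_c)
    block_means = [sum(row) / n_c for row in scores]
    coder_means = [sum(row[c] for row in scores) / n_b for c in range(n_c)]

    ss_block = n_c * sum((m - grand) ** 2 for m in block_means)
    ss_coder = n_b * sum((m - grand) ** 2 for m in coder_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_resid = ss_total - ss_block - ss_coder

    ms_block = ss_block / (n_b - 1)
    ms_coder = ss_coder / (n_c - 1)
    ms_resid = ss_resid / ((n_b - 1) * (n_c - 1))

    # Solving the expected mean squares; these can come out negative.
    return {
        "block": (ms_block - ms_resid) / n_c,
        "coder": (ms_coder - ms_resid) / n_b,
        "residual": ms_resid,
    }


def agreement_coefficient(components, n_coders):
    """Block variance over block variance plus relative error.

    Assumption: a negative block estimate is treated as zero.
    """
    block = max(components["block"], 0.0)
    error = max(components["residual"], 0.0) / n_coders
    total = block + error
    return block / total if total > 0 else 0.0


# Hypothetical scores: coder 2 is always exactly 1 point above coder 1.
biased = variance_components([[1, 2], [3, 4], [5, 6]])
# Hypothetical scores: blocks do not differ, coders disagree within blocks.
flat = variance_components([[1, 3], [3, 1], [2, 2]])

print(biased, agreement_coefficient(biased, 2))
print(flat, agreement_coefficient(flat, 2))
```

In the first hypothetical table the constant offset between coders is absorbed by the coder component, so the coefficient is unaffected. In the second, the lack of variance between blocks yields a negative block estimate and a coefficient of 0.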

The  generalizability  coefficient  for  Factor  1  (parent-infant 
interaction  with  baby  amplifying)  at  25  weeks  was  .28.     This  increased 
to  .76  at  37  weeks  and  then  declined  to  .26  at  43  weeks.    An  examination 
of  the  components  of  variance  indicates  that  the  problem  at  25  weeks  was 
largely  a  result  of  little  variance  between  blocks,  while  at  43  weeks 
the  problem  was  more  a  function  of  differences  between  coders. 

The  intercoder  agreement  for  Factor  2  (Parent-infant  interaction 
with baby responding) showed a slightly different situation. Again,
the  generalizability  coefficient  was  best  at  37  weeks  (.79),  while 
considerably  less  at  25  and  43  weeks  (.56  and  .44,  respectively). 
However,  in  this  case  the  variance  due  to  blocks  remained  steady  from 
25  weeks  to  37  weeks;  it  then  dropped  dramatically  at  43  weeks.  The 
fact  that  the  variance  due  to  the  coder  x  block  interaction  fluctuated 
widely  over  the  three  occasions  indicates  some  problem  in  intracoder 
stability.

The analysis of intercoder agreement for Factor 3 (General baby
behavior) shows still a different situation. In this case the
generalizability coefficients for 25 weeks and 43 weeks were both moderate
(.65 and .44, respectively), while the coefficient at 37 weeks was much
higher (.89). Notice that the problem with differences between subjects
was similar to that for Factor 1 but the variance due to different
ratings for subjects increased dramatically at 43 weeks. This factor
was composed only of measures of baby behavior and is, therefore, some
indication of the lack of parent verbal behavior. It is possible,
therefore, that one coder was not coding as much parent verbal behavior as
the other coder.

The analysis of intercoder agreement for Factor 4 (Parent-infant
interaction with baby warming, accepting) showed a pattern similar to
Factor 1. The generalizability coefficient for 25 weeks was low (.21);
it increased at 37 weeks (.85) and then decreased at 43 weeks (0). The
lack of variance between subjects and the wide variation in the coders'
ratings of subjects again both seem to be problems. Notice that at 43
weeks the variance due to differences between subjects decreased by
62% while the variance due to ratings by coders for subjects increased
over 200%, resulting in a generalizability coefficient of 0.

The analysis of intercoder agreement for Factor 5 (Parent-infant
interaction with baby initiating) showed a pattern similar to that of
Factor 2 in that the variance due to blocks remained stable from 25 to
37 weeks while the variance due to different ratings by coders decreased,
resulting in a larger generalizability coefficient (.79 and .93,
respectively). However, in this case the variance due to differences between
subjects decreased dramatically at 43 weeks, while the variance due to
ratings by coders shifted dramatically from occasion to occasion,
resulting again in a generalizability coefficient of 0.

In  summary,  there  were  apparently  some  severe  problems  with  respect 
to  intercoder  agreement  at  both  25  and  43  weeks.     As  stated  previously, 
the  lack  of  variance  between  subjects  and  the  lack  of  intracoder 
stability  seem  to  be  the  major  sources  of  the  problems.    Only  Factor  5 
showed  an  acceptable  level  at  either  of  these  two  occasions. 

All  of  the  generalizability  coefficients  show  improvement  at  37 
weeks.    From  an  analysis  of  the  means  for  the  coder  facet,  it  would 
appear  that  there  was  a  systematic  bias  on  the  part  of  coder  1  to  assign 
higher  scores  for  Factor  3  and  lower  scores  for  Factor  4,  yet  both 
generalizability  coefficients  were  larger  than  .80.    This  could  happen 
if  both  coders  were  ranking  subjects  in  a  similar  manner  even  though 
one was consistently assigning higher scores.

Reliability

Subjects  crossed  with  coders.    The  original  reliability  analysis 
for  subjects  crossed  with  coders  also  had  a  large  number  of  negative 
estimates  of  variance  components  for  the  facet  of  Parent  Sex,  and, 
therefore,  a  second  analysis  was  done.    The  full  model  for  the  randomized 
block  factorial  design  used  in  this  analysis  is  the  same  as  that  shown 
in Table 7 with the Parent Sex term (P) removed; a block in this model
indicates a parent-infant pair. The algorithms for computing the Expected
Mean Squares and the Estimates of the Variance Components are the same as
shown in Table 8 with all references to Parent Sex removed; in this case
the coder x Occasion x block interaction (cOb) is the residual.

The components of variance for each factor are shown in Tables
42-46.    Using  this  model  there  were  only  five  negative  estimates  (three 
involving  the  main  effect  of  coder  and  two  involving  the  coder  x  Occasion 
interaction),  with  the  largest  being  -.58.    The  largest  estimates  for 
Factors  1,  3  and  5  were  for  either  the  main  effect  of  Occasion  or  the 
Occasion  x  block  interaction;  for  Factor  2  they  were  the  Occasion  x 
block  interaction  and  the  residual;  for  Factor  4  they  were  the  main 
effect  of  Occasion  and  the  residual.    The  algorithms  for  computing  the 
generalizability  coefficients  for  this  design  are  shown  in  Table  47. 

None of the factors had a value above .70 for the coefficient
$\rho^2(b,c,O)$, so it was necessary to look to a first-order interaction
(see Table 48). Factors 1, 3 and 5 had values above .70 for the coefficient
$\rho^2(b,c,O^*)$, which was an estimate of reliability when generalization
was intended over all parent-infant pairs and coders, but to only one level
of Occasion (.79, .75 and .78, respectively). It was not until
generalization was restricted to all parent-infant pairs, this specific set
of coders and one level of Occasion that the generalizability coefficient
was above .70 for Factors 2 and 4 (.81 and .80, respectively).
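
The way the coefficient rises as facets are fixed can be sketched as follows. This is a hedged illustration using a common textbook form of the generalizability coefficient for relative decisions; it is not the set of equations in Table 47, which may differ (for example, in whether a fixed facet is averaged over its levels). The variance components below are hypothetical, and `g_coefficient` is an illustrative helper name.

```python
def g_coefficient(components, n_levels, fixed=(), residual=None):
    """Generalizability coefficient for relative decisions (textbook form).

    components: maps a string of facet letters (the facets interacting
                with blocks; "" is the block main effect) to an estimated
                variance component.
    n_levels:   number of levels of each facet that scores are averaged over.
    fixed:      facets to which generalization is NOT intended (starred).
    residual:   key of the highest-order term, always counted as error
                because it is confounded with experimental error.
    """
    universe = 0.0
    error = 0.0
    for facets, estimate in components.items():
        estimate = max(estimate, 0.0)  # assumption: negative estimates -> 0
        divisor = 1
        for facet in facets:
            divisor *= n_levels[facet]
        if facets != residual and set(facets) <= set(fixed):
            universe += estimate / divisor  # becomes universe-score variance
        else:
            error += estimate / divisor     # remains relative error
    total = universe + error
    return universe / total if total > 0 else 0.0


# Hypothetical components: block, block x coder, block x Occasion, residual.
components = {"": 4.0, "c": 1.0, "O": 6.0, "cO": 3.0}
n_levels = {"c": 2, "O": 3}  # two coders, three occasions

rho_all = g_coefficient(components, n_levels, residual="cO")
rho_fixed_o = g_coefficient(components, n_levels, fixed="O", residual="cO")
rho_fixed_co = g_coefficient(components, n_levels, fixed="cO", residual="cO")

print(round(rho_all, 2), round(rho_fixed_o, 2), round(rho_fixed_co, 2))
```

Each time a facet is fixed, its interaction with blocks moves from the error term to the universe-score term, so the coefficient can only stay the same or increase, at the cost of a narrower universe of generalization.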

Subjects nested within coders. The original reliability analysis
for subjects nested within coders had a large number of negative estimates
of variance components for the facets of Infant Sex and Parent Sex, and,
therefore, these were eliminated from the model and a second analysis
was done. The full model for the three-facet split-plot design used in
this analysis is the same as that shown in Table 10 with the Infant Sex
and Parent Sex terms ($I_k$ and $P_j$) removed; subjects in this model
refers to a parent-infant pair. The algorithms for computing the Expected
Mean Squares and the Estimates of Variance Components are the same as
shown in Table 11 with all reference to Infant Sex and Parent Sex removed;
in this case ATs(c) is the residual.

The  components  of  variance  for  each  factor  are  shown  in  Tables  49- 
53.    Using  this  model  there  were  only  seven  negative  estimates  with  the 
largest  being  -.79.    The  largest  estimate  for  each  factor  was  for  the 
residual  with  the  exception  of  Factor  5  where  the  largest  estimate  was 
for  the  main  effect  of  Type  of  Task. 

The algorithms for computing the generalizability coefficients for
this design are shown in Table 54. Notice that since not all subjects were
observed by all coders, the variance due to coders was included as
part of the observed score variance. In this case variance due to coders
is confounded with variance due to groups and, therefore, the estimates
of the generalizability coefficients will be conservative. None of the
factors had a value above .70 for the coefficient $\rho^2(s,c,A,T)$, so
it was necessary to look to a first-order interaction (see Table 55).
None of the factors had a value above .70 for either of the two first-order
interaction coefficients, so that it was necessary to look to the
second-order interaction. Factors 3 and 5 had values of .78 and .73,
respectively, for the coefficient $\rho^2(s,c,A^*,T^*)$, but the values of
the coefficients for Factors 1, 2 and 4 did not reach .70 (.66, .63 and
.59, respectively).

Summary 

Thirty-two  measures  of  parent-infant  interaction  from  the  Reciprocal 
Category  System  were  factor  analyzed  and  rotated  to  the  Varimax  criterion. 
The  resulting  five  factors  accounted  for  75%  of  the  common  variance  and 
were basically determined by variance due to the babies' behavior.

Three separate analyses were done: one for intercoder agreement
and  two  for  reliability.     In  each  of  these  analyses  there  were  negative 
estimates  associated  with  either  Parent  Sex  or  Infant  Sex  or  both. 
Therefore,  three  additional  analyses  were  done  which  eliminated  these 
two  facets  from  the  models. 

In  the  second  analysis  of  intercoder  agreement  (where  a  block 
equaled  a  parent-infant  pair)  it  was  shown  that  in  nine  of  ten  cases 
for  25  and  43  weeks  the  largest  estimated  variance  component  was  for 
the coder x block interaction. However, at 37 weeks the largest
estimated variance component for all cases was for block. Consequently,
the  generalizability  coefficients  estimating  intercoder  agreement  at 
37  weeks  were  all  above  .70.    However,  at  25  and  43  weeks  only  the 
coefficient  for  Factor  5  at  25  weeks  was  above  .70. 

In  the  second  analysis  of  reliability  when  blocks  (a  block  equaled 
a parent-infant pair) were crossed with coders it was shown that the
largest estimates of variance components for Factors 1, 3 and 5 were
for either the main effect of Occasion or the Occasion x block interaction;
for Factor 2 they were the Occasion x block interaction and the residual;
for  Factor  4  they  were  the  main  effect  of  Occasion  and  the  residual. 
For  Factors  1,  3  and  5  the  values  of  the  generalizability  coefficients 
were  above  .70  when  generalization  was  intended  to  all  parent-infant 
pairs  and  all  coders,  but  to  only  one  level  of  Occasion  (.79,   .75  and 
.78,  respectively).     For  Factors  2  and  4  the  values  of  the  generalizability 
coefficients  were  larger  than  .70  only  when  generalization  was  intended 
to  all  parent-infant  pairs,  but  to  the  specific  coders  used  in  this 
analysis and one specific level of Occasion (.81 and .80, respectively).

In the second analysis of reliability when subjects (a subject
being  a  parent-infant  pair)  were  nested  within  coders  it  was  shown  that 
the  largest  estimate  of  variance  component  for  each  factor  was  for  the 
residual  with  the  exception  of  Factor  5  where  the  largest  estimate  was 
for  the  main  effect  of  Type  of  Task.     It  was  pointed  out  that  in  this 
analysis  the  variance  due  to  coders  was  confounded  with  the  variance 
due  to  group  so  that  the  estimates  of  the  generalizability  coefficients 
would  be  conservative.    Nevertheless,  the  coefficients  for  Factors  3 
and  5  were  above  .70  when  generalization  was  intended  to  all  parent- 
infant  pairs,  but  to  a  specific  level  of  Age  of  Infant  and  a  specific 
level  of  Type  of  Task  (.78  and  .73,  respectively).     The  coefficients 
for Factors 1 and 2 were .66 and .63, respectively, while the coefficient
for  Factor  4  was  .59. 


CHAPTER  V 
DISCUSSION  AND  CONCLUSIONS 


The  purpose  of  this  study  was  to  investigate  the  generalizability 
of  factors  of  parent-infant  interaction.    There  were  two  major  aspects: 
first,  the  generation  of  factor  scores;  second,  the  analysis  of  the 
generalizability  of  those  factor  scores. 

Factors  of  Parent-Infant  Interaction 

Five factors were derived from 32 measures obtained by systematic
observation of video-tapes using the Reciprocal Category System (RCS).
These measures accounted for 84% of the total tallies recorded by the
observers and were an elaboration of two parent-infant patterns reported
by Gordon (1975). The first pattern was called Ping-Pong and was
described as looking like a Ping-Pong game: parent does something, baby
does something, parent does something, baby does something. Four of
the five factors derived in this project fit the Ping-Pong pattern. On
Factor 1 the baby is amplifying; on Factor 2 the baby is responding; on
Factor 4 the baby is warming or accepting; and on Factor 5 the baby is
initiating. On each of these factors all of the parent variables that
were used (accepts, amplifies, elicits, initiates and directs) loaded
together on the same factor dependent upon the baby behavior. This
confirms previous work by Gordon (1974), Barkow et al. (1975) and
Clarke-Stewart (1973), which indicated that parent behaviors tend to be
highly correlated and therefore cluster together.

A second pattern identified by Gordon (1975) was called Persistence
and was described as the child being permitted or encouraged
to continue an activity on his or her own. Factor 3, which was labeled
General Baby Behavior, corresponds to this second pattern.

A  third  pattern  was  called  Professor  and  described  as  parent 
talking  followed  by  parent  talking  without  paying  attention  to  whether 
or  not  anyone  is  tuned  in.    However,  this  pattern  was  not  seen  with 
enough  frequency  in  the  present  project  to  be  considered.    All  of  the 
parents  in  this  project  seemed  to  be  particularly  attentive  to  their 
infants.    This  was  probably  due  to  differences  in  the  two  types  of 
projects.     In  the  earlier  project  (Gordon  and  Jester,  1972)  low-income 
mothers were being taught to play with their infants by parent educators,
whereas  in  the  present  project  (Gordon  and  Soar,  in  progress)  middle- 
class  mothers  and  fathers  were  playing  with  their  infants  alone. 

Generalizability  Analysis 

Designs 

To  date  there  have  been  no  published  studies  by  researchers  using 
systematic  observation  to  study  parent-infant  interaction  which  have 
reported  more  than  intercoder  agreement  coefficients  as  measures  of 
reliability  even  though  that  method  has  been  shown  to  be  inappropriate 
(see  Medley  and  Mitzel,  1963;  Cronbach  et  al.,  1963).    The  purpose  of 
this study was to use three separate designs to investigate the
generalizability of factor scores of parent-infant interaction.

The first design was a randomized block factorial design with the
two facets being coder and Parent Sex and a block designated as an
infant. The data were analyzed at three separate occasions: 25 weeks,
37 weeks, and 43 weeks. Because of the large number of negative estimates
of  variance  components  for  Parent  Sex,  a  second  analysis  was  done  with 
the  single  facet  of  coder  and  a  block  designated  as  a  parent-infant  pair. 
In  this  second  analysis  there  were  only  five  negative  estimates  out  of 
the  45  produced,  four  of  which  involved  the  main  effect  of  coder. 

The  generalizability  coefficients  were  estimates  of  intercoder 
agreement  and  are  shown  in  Table  41.     The  value  of  the  coefficients  was 
above  .70  for  all  factors  at  37  weeks.    However,  at  25  and  43  weeks 
only  the  value  for  Factor  5  was  above  .70. 

The  second  design  was  again  a  randomized  block  factorial  design 
which  provided  an  analysis  of  reliability  of  factor  scores  of  parent- 
infant  interaction  when  subjects  were  crossed  with  coders.    The  three 
facets were coders, Parent Sex and Occasion, and a block was designated
as  an  infant.    There  were  again  a  large  number  of  negative  estimates  of 
variance  components  associated  with  the  facet  of  Parent  Sex,  so  that 
facet  was  eliminated  and  a  second  analysis  done.    Using  this  second 
model  there  were  only  five  negative  estimates  out  of  the  35  produced, 
three  involving  the  main  effect  of  coder  and  two  involving  the  coder  x 
Occasion  interaction.     The  three  Occasions  used  were  again  25  weeks, 
37  weeks  and  43  weeks. 

The generalizability coefficients were estimates of reliability
and are shown in Table 48. Factor 1 (Parent-infant interaction with
baby amplifying), Factor 3 (General Baby Behavior) and Factor 5
(Parent-infant interaction with baby initiating) all had values above .70
for the coefficient $\rho^2(b,c,O^*)$, which was an estimate of reliability
when generalization was intended over all parent-infant pairs and all
coders, but to a specific level of Occasion. Factor 2 (Parent-infant
interaction with baby responding) and Factor 4 (Parent-infant interaction
with baby warming, accepting) had values above .70 for the coefficient
$\rho^2(b,c^*,O^*)$, which was an estimate of reliability when
generalization was intended over all parent-infant pairs, but to the
specific coders used in this study and to one level of Occasion. However,
Factor 2 had a value of .69 for the coefficient $\rho^2(b,c,O^*)$, so that
it was really only Factor 4 which seemed to have substantial problems with
reliability.

The third design was a five-facet split-plot design using the
facets of coder, Infant Sex, Parent Sex, Age of Infant and Type of Task.
This design provided an analysis of reliability of factor scores when
subjects (designated as infants) were nested within coders. Again,
there were a large number of negative estimates of variance components
and a second analysis was done using only the facets of coder, Age of
Infant and Type of Task; subject in this model was designated as a
parent-infant pair. In this second analysis there were seven negative
estimates out of the 55 produced, with only two being larger in absolute
value than .20.

It  was  anticipated  that  by  partitioning  Occasion  into  two  separate 
facets,  one  of  Age  and  one  of  Task  difficulty,  the  factor  scores  would 
prove  to  be  reliable  across  one  or  the  other  facet,  and  it  would  not  be 
necessary  to  limit  generalization  to  an  occasion.     However,  such  was 
not  the  case.    An  analysis  of  the  estimates  of  variance  components  shown 
in Tables 49-53 shows that estimates for Age x subjects(coders) and
Task x subjects(coders) were both substantial for all factors irrespective
of the size of the estimate for the main effect of Age or Task.
For example, for Factor 1 the estimate of variance component for Age is
7.93 while the estimate for Task is -.18. Yet the estimate for As(c)
is 8.42 and the estimate for Ts(c) is 11.33. For Factor 5 the estimate
of variance component for Age is .79 and for Task is 45.96, while the
estimate for As(c) is 18.36 and the estimate for Ts(c) is 26.29.

The  generalizability  coefficients  were  again  estimates  of  reliability 
and are shown in Table 55. Only Factors 3 and 5 had values above .70
for the coefficient $\rho^2(s,c,A^*,T^*)$. There are at least two possible
explanations for the discrepancy in findings between this analysis and
the previous analysis. First, this analysis included the 19 week session
whereas the other analysis estimating reliability did not. It is possible
that differences between coders at this earlier session were
substantial  and,  therefore,  the  reliability  was  reduced  accordingly. 
Unfortunately,  there  is  not  enough  matched  data  at  this  session  to  test 
this  possible  explanation,  but  the  intercoder  agreement  analysis  at  25 
weeks  would  seem  to  lend  it  some  support. 

A second explanation could be that since in this second design
variance due to coders and variance due to groups to which coders were
assigned were completely confounded, the inclusion of the variance due
to coders as part of observed score variance (see Table 54) is making
the estimate of reliability too conservative. Accordingly, the generalizability
coefficients were computed with a different definition of
observed score variance and the two sets of coefficients are shown in
Table 56. The only notable difference is that the value of the coefficient
$\rho^2(s,c,A^*,T^*)$ for Factor 1 under method B is above .70.
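
The effect of the two definitions can be illustrated with a small Python sketch. It treats the coefficient as the ratio of universe score variance to observed score variance and contrasts a definition that counts coder variance as part of observed score variance (labeled A here) with one that leaves it out (labeled B). The actual definitions are those of Table 54, which is not reproduced in this section; the function name and all variance values below are hypothetical.

```python
def g_coefficient(universe_var, error_vars):
    """Ratio of universe score variance to observed score variance.

    error_vars maps each source counted as error to its variance component;
    observed score variance is taken as universe variance plus their sum.
    """
    observed_var = universe_var + sum(error_vars.values())
    return universe_var / observed_var


# Hypothetical variance components for a single factor (not from Table 54).
subject_var = 10.0
coder_var = 3.0
residual_var = 2.0

# Definition A: coder variance counted in observed score variance.
method_a = g_coefficient(subject_var, {"coder": coder_var, "residual": residual_var})
# Definition B: coder variance left out of observed score variance.
method_b = g_coefficient(subject_var, {"residual": residual_var})

if __name__ == "__main__":
    print(f"A: {method_a:.2f}  B: {method_b:.2f}")
```

With these made-up values the coefficient moves from below .70 under A to above .70 under B, which is the kind of change described for Factor 1.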

The coefficient $\rho^2(s,c,A^*,T^*)$ in the split-plot design is comparable
to the coefficient $\rho^2(s,c,O^*)$ in the randomized block factorial design.
Table 57 is a comparison of generalizability coefficients under the three
conditions mentioned previously. Notice that for Factors 1, 2 and 5 the
coefficient generated under the split-plot design is lower than that
generated under the factorial design, whereas for Factors 3 and 4 the
opposite is true. However, if .70 is used as the cutoff point in determining
whether or not a factor is reliable, only the decision concerning
Factor 1 would change under the A condition for the coefficient $\rho^2(s,c,A^*,T^*)$
when compared to the coefficient $\rho^2(s,c,O^*)$; if the B condition were used,
the decisions made about the reliability of the factor scores would be
exactly the same.

Analysis of Reliability

Medley and Mitzel (1963) identified two major reasons for the unreliability
of data. The first was that separate measures of the same
subject tend to differ too much across occasions. There were three
major aspects of these differences: (1) items which enter into measurement
lack consistency; (2) there is a lack of coder agreement; and (3)
the  behavior  of  the  subject  is  unstable.    The  second  major  reason  was 
that  differences  between  subjects  were  too  small.    The  following  is  an 
analysis  of  the  five  RCS  factors  according  to  these  criteria. 

Item  consistency.    There  were  several  RCS  behaviors  with  which  we 
had  difficulty  during  coder  training  which  may  be  defined  as  "item 
consistency"  problems.    One  was  the  item  "01  -  Baby  Warms"  (see  Table  2). 
This  was  supposedly  a  measure  of  non-task  related  behavior  in  the  young 
infant.    Also,  part  of  the  definition  for  this  behavior  included 
"self-reinforcing behavior" such as thumb sucking or putting a toy in the
mouth. However, was the infant who waved a hand in front of his/her
face and then put it in the mouth exhibiting "non-task related positive
affect"? If the coder must decide whether that behavior is initiating
(behavior with no task-related antecedent), or warms, it is easy to see
why there would be lack of coder agreement. My recommendation would be
to change the definition of this item to:

Baby  warms  -  task  or  non-task  related  behavior  in  which  the  infant 
smiles, laughs, gurgles, coos, etc. (i.e., actively exhibits
positive affect). This behavior may be seen at the same time as
other infant behavior in which case, if there is parent verbal behavior,
that parent behavior would be double coded. For example,
if  a  baby  reaches  out  and  grabs  a  rattle  while  smiling  at  the  same 
time  in  response  to  parent  verbal  behavior,  the  proper  coding  would 
be:     05  parent  behavior  01  parent  behavior. 

This would eliminate the problem of defining what is "self-reinforcing"
to  the  infant  and  would  also  eliminate  the  necessity  of  distinguishing 
between  task-  and  non-task-related  behavior. 

A second item that caused some difficulty was "02 - Baby accepts"
(see Table 2). The less-than-six-month-old infant would quite frequently
look at what the parent was doing, glance away briefly, again look at
the parent, again glance away, etc. If one tried to code every behavior
change it became almost impossible to get any intracoder stability, let
alone intercoder agreement. Therefore, I would add to the definition:

Baby  accepts  -  infant  must  be  orienting  visually  to  parent.  If 
infant  glances  away  briefly,  but  is  spending  the  majority  of  the 
time  period  observing  the  parent,  then  record  only  one  behavior. 

A third problem that arose was the coding of baby babbling. It
was very difficult to distinguish between baby sounds except in relation
to other on-going behavior. Therefore, I would add an addendum to the
description of baby behavior (see Table 2):

Baby  babbling  -  (a)  code  as  baby  responds  if  the  babbling  follows 
a  parent  verbal  behavior  such  as  accepts,  elicits,  initiates,  etc., 
(b)  code  as  baby  initiates  if  parent  behavior  does  not  immediately 
precede, (c) code as baby amplifies if the baby sound builds upon
(is  noticeably  different  from)  previous  sounds. 
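
The babbling addendum can be read as a small decision procedure. The Python sketch below is one hypothetical reading of it and is not part of the RCS itself. The function name, its two boolean inputs, and in particular the order in which the three clauses are checked (amplifies first) are assumptions, since the addendum does not say which clause applies when more than one could.

```python
def code_babble(follows_parent_verbal: bool, builds_on_previous: bool) -> str:
    """Return a label for one babbling event under the proposed addendum.

    Assumption: clause (c) is checked first, so a sound that is noticeably
    different from the previous sounds is coded as amplifies even when it
    also follows a parent verbal behavior.
    """
    if builds_on_previous:
        return "baby amplifies"  # clause (c)
    if follows_parent_verbal:
        return "baby responds"  # clause (a)
    return "baby initiates"  # clause (b): no immediately preceding parent behavior


if __name__ == "__main__":
    print(code_babble(follows_parent_verbal=True, builds_on_previous=False))
```

Writing the rule out this way makes the unresolved precedence question explicit; coders would need to agree on it during training.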

A  fourth  problem  was  the  coding  of  parent  behaviors  such  as  clicking, 
cooing,  or  other  non-speech  related  sounds.    They  were  quite  obviously 
verbal  behavior,  but  their  meaning  was  not  at  all  clear.    My  recommendation 
would  be  to  code  these  as  parent  initiates  although  a  case  could  also 
be  made  for  coding  these  as  elicits  or  amplifies  depending  upon  the 
circumstances.

Overall, I believe that the descriptions of behaviors for the RCS
items  were  quite  functional.    For  the  vast  majority  of  behaviors  exhibited 
by  both  infant  and  parent  it  was  not  too  difficult  to  come  to  some  sort 
of  agreement  if  the  video-tape  of  a  session  was  viewed  by  two  or  more 
persons  at  the  same  time  and  the  observers  were  allowed  to  discuss  how 
the  behavior  was  to  be  coded.    The  major  problem  seemed  to  be  inconsistent 
coding  of  the  more  subtle  behaviors  by  one  of  the  observers  when  coding 
alone.

Coder agreement. A second part of the problem of reliability has
to  do  with  lack  of  coder  agreement.     The  second  analysis  of  intercoder 
agreement  showed  that  the  factor  scores  of  parent-infant  interaction  at 
37  weeks  were  quite  acceptable.    However,  the  coefficients  at  25  and  43 
weeks  were  not  acceptable,  with  the  exception  of  Factor  5  at  25  weeks. 
At  25  weeks  it  is  possible  that  the  ambiguity  of  the  babies'  behavior 
was  a  problem  for  all  factors  in  addition  to  the  lack  of  variance  between 
subjects  for  Factors  1  and  3.    At  43  weeks,  however,  differences  between 
coders were a major problem. This was not likely a result of a lack of
training  or  ambiguity  of  behaviors  since  the  intercoder  agreement  at  37 
weeks  was  high.     It  was  more  likely  a  result  of  one  of  the  coders  attempting 
to  code  too  quickly.    Our  experience  indicated  that  it  took  about  50 
minutes  to  code  10  minutes  of  tape  and  that  two  hours  of  coding  at  one 
time  was  about  the  maximum  that  one  could  code  before  becoming  less 
accurate. One coder got behind because of a heavy class load and apparently
was coding too quickly in an effort to catch up. The difficulty
in keeping these types of coding errors under control is emphasized by
the fact that most of the data for the 43 week session were coded in about
three weeks.

It  is  my  opinion  that  much  of  the  coder  disagreement  was  a  result 
of  two  different  problems.    One  was  initial  differences  between  coders, 
and  the  second  was  the  instability  of  coder  behavior.    With  respect 
to  the  first  problem,  one  coder  seemed  absolutely  compulsive  about  coding 
every single behavior while the other was much less so. Throughout the
coding  we  had  frequent  sessions  where  the  coders  would  first  code  a  piece 
of  tape  separately  and  then  code  it  together.     Invariably,  there  would 
be  differences  which  would  be  ironed  out  only  to  reappear  at  the  next 
meeting.    As  mentioned  above,  the  second  problem  was  basically  a  result 
of  external  pressures  on  the  coders,  especially  pressures  centering  around 
getting  the  job  completed  on  time.     Intracoder  stability  is  not  a  problem 
that  has  been  extensively  discussed  in  this  literature.     Perhaps  more 
work  is  needed  in  this  area. 

Stability  of  behavior.    A  third  part  of  the  problem  of  measures 
differing  across  occasions  is  that  the  behavior  of  the  subject  is  unstable. 
Another way of saying this is that the behavior either varies unsystematically
or varies systematically across levels of variables for which we
have not accounted. Medley and Mitzel (1963) and McGaw et al. (1972)
both  state  that  it  is  this  third  source  of  error  variance  which  is  the 
most  important  and  the  present  study  confirms  their  findings.     Table  58 
shows  the  sums  of  squares  and  the  percent  of  total  sums  of  squares  (a 
measure  of  variance  explained)  for  each  source  of  variance  for  the  five 
factors  when  subjects  were  crossed  with  coders.    Notice  that  the  main 
effect  of  coder  accounts  for  one  percent  or  less  for  each  factor  except 
Factor 3, where it accounts for three percent. The coder x block
interaction (a block being designated as a parent-infant pair) accounts for between four
and  eight  percent.    However,  the  main  effect  of  Occasion  accounts  for 
over  25  percent  with  the  exception  of  Factor  2,  and  the  Occasion  x  block 
interaction  accounts  for  between  21  and  43  percent. 
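
The percentages in a table of this kind are each source's sum of squares divided by the total sum of squares. A minimal Python sketch of that arithmetic follows; the source labels mirror the crossed design, but the sums of squares are hypothetical values chosen for illustration and are not taken from Table 58.

```python
def percent_of_total_ss(ss_by_source):
    """Express each sum of squares as a percentage of their total."""
    total = sum(ss_by_source.values())
    return {source: 100.0 * ss / total for source, ss in ss_by_source.items()}


# Hypothetical sums of squares for one factor (not taken from Table 58).
example_ss = {
    "block": 110.0,
    "coder": 10.0,
    "occasion": 300.0,
    "coder x block": 60.0,
    "occasion x block": 320.0,
    "residual": 200.0,
}

if __name__ == "__main__":
    for source, pct in percent_of_total_ss(example_ss).items():
        print(f"{source:<18}{pct:5.1f}%")
```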

Table  59  is  a  summary  of  the  same  information  when  subjects  were 
nested  within  coders.    Although  the  numbers  are  different  because  of  the 
confounding  of  the  variance  due  to  coder  and  groups  to  which  coders  were 
assigned  and  the  inclusion  of  an  extra  session  in  the  split-plot  analysis, 
the  overall  story  is  still  the  same.    With  the  exception  of  Factor  4,  the 
variance  due  to  the  main  effect  of  coder  is  substantially  less  than  one 
of the main effects of either Age of Infant or Type of Task. Comparing
the results of the two analyses, it appears that the original differences
between groups to which coders were assigned were influencing the variance
due to coders such that differences were magnified, whereas for Factor 3
the influence is such that those differences were being concealed.

Even though the facets used in each of the analyses explain some
portion  of  the  variance  for  each  of  the  factors,  there  is  still  a  large 
portion  of  variance  which  is  not  explained.     It  is  possible  that  the 
behavior  is  truly  "unstable"  but  there  is  also  the  possibility  that  some, 
as  yet  unidentified,  facet  can  account  for  additional  variance.  One 
possibility might be a measure of infant temperament such as that developed
by Thomas et al. (1963) and refined by Pederson, Anderson and Cain
(1976). Another possibility might be a measure of parent sex-role as
developed by Bem and her associates (1974, 1976). In any case there is
still  much  parent-infant  interaction  variance  as  measured  by  these  five 
factors  to  be  explained  over  and  above  that  accounted  for  by  the  five 
facets  used  in  this  study. 

Differences  between  subjects.    A  second  major  reason  cited  by 
Medley  and  Mitzel  (1963)  for  measures  being  unreliable  was  that  differences 
between  subjects  are  too  small.     In  the  intercoder  agreement  analysis 
this  was  cited  as  a  possible  cause  of  difficulty  for  Factors  1  and  3 
at  the  early  age  grouping  although  overall  it  was  not  as  significant 
a  problem  as  one  might  expect  when  studying  as  restricted  a  group  as  we 
had.    With  respect  to  the  two  analyses  of  reliability  it  would  appear 
that  Factors  2  and  4  were  the  ones  that  were  most  affected  by  this 
problem.     It  was  especially  crucial  for  Factor  4  since  the  coder  x  block 
interaction  in  the  analysis  where  subjects  were  crossed  with  coders 
accounted for almost as much variance as did blocks (8% and 11%, respectively);
in the analysis where subjects were nested within coders the
main effect for coders was larger than the main effect of blocks (16% and
11%, respectively).

Table 60 provides a summary judgment of sources of unreliability
for each factor of parent-infant interaction. The subjects' behavior
was unstable to the extent that the facets of coder and/or Occasion
(or Infant Age Grouping and Type of Task) had to be considered before
the generalizability coefficients computed for each facet in the reliability
studies rose above .70. Observer disagreement was cited as a problem
at both the 25 week and 43 week session according to the intercoder agreement
analysis. The first reliability analysis (where subjects were crossed
with coders) suggested that there may be some problem with coder disagreement
for Factor 3. The second reliability analysis (where subjects were
nested within coders) suggested there may be some problem for Factor 1.
However, both reliability analyses suggested there were substantial
problems with coder disagreement for Factor 4. Little variance between
subjects was also cited as a possible problem for Factors 1 and 3 before
37 weeks and a possible problem for Factor 2 overall. Lack of item
consistency was also cited as a problem for Factors 1, 3, and 4.

Additional  Research 

One  of  the  main  reasons  for  investigating  the  generalizability  of 
measures  is  to  provide  some  information  about  the  reliability  of  measures 
prior  to  their  being  used  in  a  decision  study.    One  type  of  decision 
study  that  could  be  done  would  be  to  attempt  to  establish  the  predictive 
validity  of  these  measures  by  studying  their  relationship  with  an  infant 
competence  measure  such  as  the  Bayley  MDQ.    Also,  the  investigation  of 
the  generalizability  of  these  measures  could  be  useful  in  designing 
future studies of parent-infant interaction. As explained below, this
study provides useful information for both of these endeavors.

It  must  be  remembered  that  a  generalizability  coefficient  is  an 
estimate  of  the  relationship  of  an  observed  score  and  a  corresponding 
universe  score.    Generalization  across  all  levels  of  facets  represents 
a  "larger  universe"  in  that  one  can  generalize  a  subject's  score  to  any 
observation  being  considered.     If  there  is  little  or  no  generalizability 
for  a  subject's  score  in  this  case  then  one  is  forced  to  look  at  a 
first-order  coefficient  of  facet  by  subject  in  an  attempt  to  increase 
the  reliability.    However,  every  higher  order  coefficient  represents 
a  "smaller  universe"  over  which  one  can  generalize.     Therefore,  it  is 
better  to  pick  the  level  of  analysis  in  a  decision  study  as  low  in  the 
hierarchy  as  possible.     Unfortunately,  in  the  present  case  it  does  not 
seem feasible to use less than a first-order interaction of facet by
subject. It was only after generalization was not intended to all levels
of Occasion (or Infant Age Grouping and Type of Task) that generalizability
coefficients over .70 were attained.

It  was  shown  that  the  main  effect  of  the  facet  of  Parent  Sex  was 
not  an  important  source  of  variance,  but  that  the  Parent  Sex  x  block 
interaction  was  important  for  Factors  1  and  3.     This  would  suggest  that, 
even  though  the  Parent  Sex  was  eliminated  in  the  additional  analyses,  it 
would  be  important  to  be  specific  as  to  the  parent  involved  when  using 
these  two  factors  in  a  D  study.     Unfortunately,  there  was  only  sufficient 
data  about  the  facet  of  Infant  Sex  to  state  that  it  did  not  appear  to 
account  for  enough  variance  to  warrant  its  inclusion  in  the  model. 

In  terms  of  planning  future  studies,  it  seems  appropriate  to  note 
the  relative  importance  of  Occasion  in  accounting  for  variance  as 
compared to either coder or Parent Sex in the design where coders were crossed
with subjects. Using intercoder agreement as a measure of reliability
simply does not consider this important source of variance. While only
six subjects were used in this analysis compared with the 14 to 18 subjects
used at each session in the intercoder agreement analysis, it would appear
that the reliability analysis provided additional valuable information.
Notice  that  six  subjects  observed  at  three  sessions  by  two  observers 
is  equivalent  to  18  subjects  observed  at  one  session  by  two  observers. 
Since  resources  for  coding  seem  to  always  be  less  than  one  would  prefer, 
it  would  seem  preferable  to  observe  a  smaller  number  of  subjects  over 
several  occasions  rather  than  what  has  traditionally  been  done  with 
intercoder  agreement  analysis. 
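
The equivalence in coding effort noted above is simple arithmetic, sketched here in Python. The coding-time estimate assumes six minutes of tape per session (three minutes for each parent) and uses the ratio of about 50 minutes of coding per 10 minutes of tape reported earlier; the function names and the default values are illustrative assumptions.

```python
def coded_sessions(n_subjects, n_sessions, n_coders):
    """Number of session tapes that must be coded in total."""
    return n_subjects * n_sessions * n_coders


def coding_hours(n_coded_sessions, tape_minutes_per_session=6.0, coding_ratio=5.0):
    """Rough coding time in hours.

    Assumes 6 minutes of tape per session and 5 minutes of coding
    per minute of tape (50 minutes of coding per 10 minutes of tape).
    """
    return n_coded_sessions * tape_minutes_per_session * coding_ratio / 60.0


reliability_design = coded_sessions(n_subjects=6, n_sessions=3, n_coders=2)
agreement_design = coded_sessions(n_subjects=18, n_sessions=1, n_coders=2)

if __name__ == "__main__":
    print(reliability_design, agreement_design, coding_hours(reliability_design))
```

Both designs call for the same number of coded sessions, so the choice between them turns on what information is gained rather than on coding cost.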

It  also  seems  appropriate  to  note  the  relative  importance  of  Age  of 
the  Infant  and  the  Type  of  Task  in  accounting  for  variance  as  compared 
with  either  Parent  Sex  or  Infant  Sex  in  the  design  where  subjects  were 
nested  within  coders.    While  it  is  true  that  this  particular  design  is 
less  powerful  in  showing  differences  with  respect  to  Infant  Sex  than  the 
other  variables  (Kirk,  1968),  still  this  analysis  emphasized  similarities 
rather  than  differences  in  activity  level  and  parent-infant  interaction 
patterns  for  boys  and  girls  during  the  first  year  of  life.     It  also 
points  to  the  powerful  influence  of  situational  variables  in  influencing 
the  behavior  of  parents  and  infants  and  emphasizes  the  need  to  validate 
our findings in a wide variety of contexts, especially in the home. I
would suggest four different types of tasks which might prove to be
important: 1) caretaking activities such as feeding, bathing, diapering,
etc., 2) free play with a set of toys appropriate for the age of the infant,
3) structured teaching which involves an object, and 4) structured teaching
which does not involve an object, such as making sounds, clapping hands,
singing songs, etc. It should be remembered that our data indicate a
larger  amount  of  score  variance  when  the  task  is  slightly  difficult  for 
the  infant.    Also,  Llabre  (1978)  showed  that  by  increasing  the  number  of 
occasions  to  be  sampled,  reliability  estimates  could  be  increased 
dramatically. 
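
The general mechanism behind such gains can be sketched in Python. When a subject's score is the average over several sampled occasions, the occasion-linked error variance is divided by the number of occasions, so the coefficient rises as occasions are added. This is a simplified one-facet illustration and not Llabre's procedure; the function name and the variance values are hypothetical.

```python
def projected_coefficient(universe_var, relative_error_var, n_occasions):
    """Coefficient for a score averaged over n_occasions sampled occasions.

    Assumes the single-occasion relative error variance shrinks in
    proportion to the number of occasions averaged.
    """
    if n_occasions < 1:
        raise ValueError("n_occasions must be at least 1")
    return universe_var / (universe_var + relative_error_var / n_occasions)


if __name__ == "__main__":
    # Hypothetical components: error variance three times the universe variance.
    for n in (1, 3, 9):
        print(n, round(projected_coefficient(1.0, 3.0, n), 2))
```

With these made-up components a single occasion gives a coefficient of .25, three occasions give .50, and nine give .75.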

Summary and Conclusions

This study investigated intercoder agreement and reliability of
factor scores of parent-infant interaction through the use of generalizability
theory. Three different designs were used: one for intercoder
agreement at three different occasions and two for reliability. Each of
the analyses was done a second time, eliminating the facets of Parent
Sex and/or Infant Sex. The information derived from these three analyses
led to the conclusion that even though the intercoder agreement coefficients
were substantial at 37 weeks, generalization about a
parent-infant pair's score could not be made from one occasion to another.
It was also shown that the main effects of Age of Infant and Type of Task
accounted for more variance than did either Sex of Parent or Sex of
Infant.

With  respect  to  the  success  of  this  particular  study,  it  has  been 
shown  that  information  about  the  reliability  of  factors  of  parent-infant 
interaction  using  a  split-plot  design  can  approximate  that  derived  from 
a  randomized  design,  but  because  variance  due  to  coders  is  confounded 
with  variance  due  to  group  to  which  coder  is  assigned,  the  former  cannot 
be  substituted  for  the  latter.    However,  since  resources  for  coding  are 
usually limited, an investigator using a factorial design must use a
small  number  of  subjects,  whereas  with  the  split-plot  design  the  entire 
sample  of  a  study  can  be  used.     It  therefore  seems  appropriate  to  view 
these  two  designs  as  complementary  aspects  of  a  reliability  analysis 
rather  than  competitive.    By  doing  so  researchers  can  use  systematic 
observation  and  report  detailed  information  about  the  reliability  of 
their  data  without  going  to  the  extra  expense  of  double  coding  the 
entire  sample.    Consequently,  there  is  clearly  little  excuse  for  the 
continued  use  of  coefficients  of  intercoder  agreement  as  measures  of 
reliability. 


GLOSSARY  OF  TERMS 


ANOVA
    Short-hand term for analysis of variance.

Decision study
    A study from which decisions are made as to the
    relationship of the score under investigation to
    other measures (compare to generalizability study).

Facet
    A source of score variability in addition to the
    between person variability (a factor or independent
    variable in ANOVA terminology with the exception
    that subject is never considered a facet).

Fixed variable
    A variable in which all levels about which inferences
    are to be made are included in an experiment.

Generalizability coefficient
    The ratio of the universe score variance to observed-
    score variance.

Generalizability study
    A study which is done for the purpose of investigating
    the relationship between an observed score and a
    universe score (compare to decision study).

Intercoder agreement
    The correlation between scores based on observations
    made by different coders on the same occasion.

Intraclass correlation coefficient
    The ratio of two variances both of which are
    estimates of the same population.

Intracoder stability
    The correlation between scores based on observations
    made by the same coder on different occasions.

Matched ratings
    The case where every subject is rated by all coders.

MDQ
    Bayley Mental Development Quotient: an assessment
    of infant perceptual-motor competence.

Observation composite or factor
    A procedure for combining individual scores to assign
    composite or factor scores to each of the subjects;
    it is assumed that these scores reflect some
    characteristic of the behavior of that subject.

Observation instrument
    A set of procedures whereby an observer can record
    and categorize the behavior of a subject or subjects.
    It normally consists of a number of items, to which
    the observer responds in some way dependent on the
    behavior that was observed.

Observation measure
    A procedure for using an observation record to
    assign scores to each of the subjects of observation;
    each score so assigned is assumed to reflect some
    characteristic of that subject.

Observation record
    A set of data (usually in the form of symbols)
    which describes the behavior of one or more subjects
    during one or more periods of observation.

Observed score
    The score actually obtained when rating a subject.

Occasion
    A single point in time when an observation is made.

Random variable
    A variable in which a random sample of a population
    of levels about which inferences are to be made is
    included in an experiment.

Reliability (Classical model)
    The extent to which a test is consistent in measuring
    whatever it does measure; dependability, stability,
    relative freedom from errors of measurement.

Reliability (from Medley and Mitzel, 1963)
    The extent to which the average difference between
    two measurements independently obtained for the
    same subject (i.e., obtained on two separate
    occasions by two different observers) is smaller
    than the average difference between two measurements
    obtained for different subjects.

Reliability coefficient (Classical model)
    Generally, the coefficient of correlation between
    two forms of a test, between scores on two administrations
    of the same test, or between halves
    of a test, properly corrected. The three coefficients
    measure somewhat different aspects of reliability
    but all are spoken of as reliability coefficients.

Reliability coefficient (from Medley and Mitzel, 1963)
    The correlation between scores based on observations
    made by different observers on different occasions.
    They recommend using intraclass correlation coefficients
    derived through ANOVA.

Universe score
    The average of the population of ratings that might
    be made for an individual.


APPENDIX 
DESCRIPTION  OF 
STRUCTURED  TEACHING  ACTIVITIES 


19  Weeks 


Two-Way  Stretch 

This  game's  aim  is  to  give  your  baby  practice  in  controlling 
things  around  him  by  using  his  body. 

Take  the  toy  —  a  small  telephone  rattle  with  an  elastic  strip 
attached  —  and  dangle  it  near  the  baby.    Encourage  him  to  reach 
and  grab  for  it.    Use  such  words  as  "get,"  "grab,"  and  "catch"  while 
you're  playing  together. 

When  he  does  grasp  it,  pull  gently  away  so  there's  some  stretch 
between  you  and  him.     Get  into  a  push-pull  game  with  him,  saying,  "Pull. 
You'll  pull  and  I'll  pull."    Then  gently  release  it  and  repeat.  Try 
it  so  that  he  uses  both  hands. 

Be  sure  it's  fun  and  not  teasing.     Keep  the  toy  so  he  can  get  it 
when  he  makes  an  effort.     Remember  that  the  underlying  principle 
you  want  to  convey  to  your  baby  is  that  it's  worthwhile  trying  to  do 
things,  that  an  effort  on  his  part  can  have  gratifying  results. 

When  he  makes  sounds  of  pleasure  because  he  has  grabbed  it, 
respond  to  these  sounds  by  repeating  them.     Enjoy  his  enjoyment. 
Please  feel  free  to  sit  in  one  of  the  chairs  or  on  the  floor.  If 
you  have  an  infant  seat,  it's  quite  alright  to  use  it.     The  most 
important  thing  is  for  you  and  your  baby  to  be  as  comfortable  and 
natural as possible.

25 Weeks

Mirror  and  Toy 

The  aims  of  this  game  are  to  help  the  baby  become  aware  of  his 
own  appearance  and  to  give  him  experiences  in  seeing  objects 
reflected  in  a  mirror. 

Place  your  baby  in  your  lap  so  that  he  is  facing  the  same 
direction  that  you  are.    Hold  a  mirror  so  that  he  can  see  himself. 

Point to his reflection and say, "I see ____ (your baby's name)."

"Where is ____?" "Find ____." "Look at ____."

Pick  up  the  objects  on  the  tray,  one  at  a  time,  move  them  behind 
your  baby's  head  so  that  he  can  see  them  in  the  mirror  along  with 
himself. Name the objects, telling something about the object such as,
"This is a ball and it is round." Then say, "Where is the ball?" as
you remove it from the mirror's reflection.

37 Weeks

Hide-and-Seek 

For  the  very  young  child,  out  of  sight  is  out  of  mind.  Now 
he's  ready  to  learn  that  things  exist  even  when  he  can't  see  them. 

Begin  with  a  simple  game  using  a  toy  and  some  soft  covering 
material,  such  as  a  blanket.    Attract  your  baby's  attention  to  the 
toy  and  then  partly  hide  it  under  the  blanket  so  your  baby  can  still 
see  a  part  of  it. 

Then  say,  "Where  did  it  go?"    "Find  the  Toy." 

If  he's  puzzled  and  doesn't  seem  to  know  how  to  retrieve  it, 
show  him  how.     If  he  ignores  the  toy  after  it  is  hidden,  play  with  it 
by  yourself  in  front  of  him,  but  don't  demand  his  attention  or  any 
action. He will, on his own, get interested in what you are doing.

Partly  hide  it  again  until  he's  able  to  get  it  himself. 

Play  the  same  game,  but  hide  the  toy  completely  under  the  soft 
material  so  he  can  see  that  something  is  under  the  blanket.  Encourage 
him  to  lift  it  up  and  get  his  toy. 

Repeat  this  for  fun  a  number  of  times  and  then  leave  the  child 
with both toy and blanket.

43 Weeks

Blocks 

Since  your  child  can  now  handle  small  objects  with  his  fingers 
rather  well,  he's  ready  for  block  play.     Blocks  are  perhaps  the  best 
of  all  possible  toys  because  he  can  do  so  many  things  with  them. 
Start  him  out  with  just  a  few. 

Place  two  blocks  in  front  of  him  while  you're  both  sitting  on 
the  floor  and  show  him  how  one  can  be  put  on  top  of  the  other.  Let 
him  do  it.    Then  add  a  third  so  he  can  build  a  simple  three-block 
tower.    Don't  worry  if  they're  not  directly  one  on  top  of  the  other. 
This  is  a  self-correcting  activity.     If  he  doesn't  build  well  enough, 
it  will  just  tumble  down.    He  will  enjoy  the  tumbling  as  much  as  the 
building. 

A  variation  of  this  is  to  show  him  how  you  can  place  two  or 
three  blocks  in  a  line  on  the  floor  and  push  them  around.     If  he  pushes 
on  the  third  one,  the  first  two  will  go  straight  for  a  few  seconds 
and  then  get  out  of  line.    He'll  enjoy  watching  this  happen,  and 
gradually  he'll  gain  the  skill  needed  both  to  build  the  tower  straight 
and  keep  the  blocks  in  line. 

You  can  also  make  up  your  own  variations  of  block  play.  The 
main  idea  is  to  encourage  him  to  develop  his  new  found  skills  and 
for  the  parents  and  child  to  enjoy  playing  together. 